DOCUMENT RESUME

ED 041 752                                              SE 009 204

AUTHOR        Bowers, Wayne A.
TITLE         Distributions.
INSTITUTION   Commission on Coll. Physics, College Park, Md.
SPONS AGENCY  National Science Foundation, Washington, D.C.
PUB DATE      65
NOTE          48p.; Monograph written for the Conference on the
              New Instructional Materials in Physics (University
              of Washington, Seattle, 1965)

EDRS PRICE    EDRS Price MF-$0.25 HC-$2.50
DESCRIPTORS   *College Science, *Instructional Materials,
              Measurement, *Physics, *Probability Theory, Resource
              Materials, *Statistics



ABSTRACT 

This monograph was written for the Conference on the
New Instructional Materials in Physics, held at the University of
Washington in summer, 1965. It is intended for students who have had
an introductory college physics course. It seeks to provide an
introduction to the idea of distributions in general, and to some
aspects of the subject in physics. There are three chapters. Chapter
1 gives a non-mathematical treatment of distributions. Chapter 2
considers means, other averages, standard deviation, and the binomial
distribution. Chapter 3 concerns continuous distributions and
requires students to be familiar with elementary calculus. Problems
are presented at the end of each chapter. (1C)

This monograph was written for the Conference on the New Instructional
Materials in Physics, held at the University of Washington in the
summer of 1965. The general purpose of the conference was to create
effective ways of presenting physics to college students who are not
preparing to become professional physicists. Such an audience might
include prospective secondary school physics teachers, prospective
practitioners of other sciences, and those who wish to learn physics
as one component of a liberal education.

At the Conference some 40 physicists and 12 filmmakers and designers
worked for periods ranging from four to nine weeks. The central task,
certainly the one in which most physicists participated, was the
writing of monographs.

Although there was no consensus on a single approach, many writers
felt that their presentations ought to put more than the customary
emphasis on physical insight and synthesis. Moreover, the treatment was
to be "multi-level"; that is, each monograph would consist of several
sections arranged in increasing order of sophistication. Such papers,
it was hoped, could be readily introduced into existing courses or
provide the basis for new kinds of courses.

Monographs were written in four content areas: Forces and Fields,
Quantum Mechanics, Thermal and Statistical Physics, and the Structure
and Properties of Matter. Topic selections and general outlines were
only loosely coordinated within each area in order to leave authors
free to invent new approaches. In point of fact, however, a number of
monographs do relate to others in complementary ways, a result of their
authors' close, informal interaction.

Because of stringent time limitations, few of the monographs have
been completed, and none has been extensively rewritten. Indeed, most
writers feel that they are barely more than clean first drafts. Yet,
because of the highly experimental nature of the undertaking, it is
essential that these manuscripts be made available for careful review
by other physicists and for trial use with students. Much effort,
therefore, has gone into publishing them in a readable format intended
to facilitate serious consideration.

So many people have contributed to the project that complete
acknowledgement is not possible. The National Science Foundation
supported the Conference. The staff of the Commission on College
Physics, led by E. Leonard Jossem, and that of the University of
Washington physics department, led by Ronald Geballe and Ernest M.
Henley, carried the heavy burden of organization. Walter C. Michels,
Lyman G. Parratt, and George M. Volkoff read and criticized manuscripts
at a critical stage in the writing. Judith Bregman, Edward Gerjuoy,
Ernest M. Henley, and Lawrence Wilets read manuscripts editorially.
Martha mil and Margery Lang did the technical editing; Ann Widditsch
supervised the initial typing and assembled the final drafts. James
Grunbaum designed the format and, assisted in Seattle by Roselyn Pape,
directed the art preparation. Richard A. Mould has helped in all phases
of readying manuscripts for the printer. Finally, and crucially, Jay F.
Wilson, of the D. Van Nostrand Company, served as Managing Editor. For
the hard work and steadfast support of all these persons and many
others, I am deeply grateful.

Edward D. Lambe
Chairman, Panel on the
New Instructional Materials
Commission on College Physics

PREFACE

This monograph is intended to provide an introduction to the idea of
distributions in general, and to some aspects of the subject of
importance in physics. The level is intended to be suitable for
students who have had an introductory college physics course, although
only very little knowledge of physics is actually required. The first
chapter is entirely nonmathematical, the second is intended to be
understandable to students who have not had a calculus course, and the
third chapter requires familiarity with elementary calculus. A fourth
chapter is planned which treats applications in statistical mechanics,
but that chapter is not included in the present publication.

It is my hope that the material may be helpful in giving a somewhat
more detailed introduction to certain statistical and probabilistic
ideas than commonly occurs in the standard physics texts, and that
it may therefore find some use in preparing students not only for the
study of kinetic theory and statistical mechanics, but also for other
areas of physics where these ideas are useful.

Wayne A. Bowers

INTRODUCTION



1.1 THE IDEA OF A DISTRIBUTION 

Let us imagine ourselves capable 
of seeing individual molecules flying 
about in a gas. Suppose we fix our at- 
tention on a small region, and consid- 
er the motion of the molecules as they 
move across the region. Suppose we are 
asked: "How fast do they travel?" On 
looking at several molecules, we find 
that some travel slowly, some fast; 
perhaps we find a tendency for a cer- 
tain range of speeds to predominate, 
but nevertheless values outside such a 
range occur from time to time. 

Again we look at the gas; this time we follow an individual molecule
in its path. It collides with another molecule, changes its direction
and speed, goes on to another collision, strikes the wall of the
vessel and bounces off, collides with still a third molecule - and so
on. Suppose we are asked: "How far does it travel between collisions?"
Again there is no single answer. Sometimes it travels a micron,
sometimes much less; the distances vary widely.

Or again, suppose we are simply 
asked: "How many molecules are there in 
a cubic micron?" If there is a reason- 
ably high vacuum in the system, there 
may be a density of about ten molecules 
per cubic micron; but again, as we 
watch any particular cubic micron of 
volume, we see now eight, now thirteen, 
now eleven, now seven molecules. The 
number we see varies "randomly" about 
an average value. 

The questions asked about the molecules in each of the examples given
have this one thing in common with each other and in common with a
host of other questions which arise in physics and in the other
natural sciences and the social sciences. They cannot be answered with
a single number, but only with a whole range of numbers. In this
respect they are in contrast with such questions as: "What is the
speed of light in vacuo?" and "What is the temperature of pure boiling
water under normal atmospheric pressure?", which have precise numerical
answers (ignoring very small uncertainties which decrease further with
each improvement in the experimental apparatus). The first type of
question can be given only such answers as: "Out of 200 molecules, 52
had speeds between 0 and 300 meters/sec, 89 had speeds between 300 and
600 meters/sec, and the remaining 59 had speeds greater than 600
meters/sec."

This kind of an answer we call a distribution because it tells how the
molecules are distributed with respect to the property of interest -
speed, in this case. Such distributions arise in every area of physics.
The time of decay of a radioactive nucleus, the angle of scattering of
a neutron colliding with a carbon nucleus, the position of an electron
in a hydrogen atom, the energy of the beta particle emitted in the
radioactive decay of a nuclear species - all of these and a host of
others are described by distributions rather than by single numbers.
In the social sciences, perhaps even more than in physics,
distributions are ubiquitous; such familiar examples as the
distributions of income, of life expectancy, or of education come to
mind. Not only is the determination of such distributions the aim of
much research in the social sciences; once determined, they form the
essential factual base for further economic and sociological work.

To define a distribution we must first specify a population (molecules
of oxygen gas at normal pressure and temperature; two-million-volt
neutrons scattered from carbon nuclei; students in a certain physics
course), and a characteristic or property of the individuals
comprising the population which can be measured (speed of the
molecule; angle of scattering of the neutron; final examination grade
of the student). A table, a graph, or a mathematical function telling
how many of the population have specified values of the property in
question then constitutes the distribution. More explicitly, it is
sometimes called a frequency distribution, since it gives the
frequency of occurrence of the specified values of the property in
question.

We will distinguish between discrete and continuous distributions. By
a discrete distribution we will mean one for which a finite number of
categories are used for specifying the property in question. This may
happen in two ways. First, the property may be intrinsically discrete,
as in the example described earlier of the number of molecules in a
cubic micron, which is necessarily an integer. (Many distributions
arising in probability theory are of this type. The number of heads in
a sequence of coin tosses and the number arising in the throw of a
pair of dice have such distributions.) Second, the property in
question, although having in principle a continuous range of values,
may be divided into a finite number of intervals for convenience. Thus
the range of scattering angles for the neutron extends continuously
from 0° to 180°, but it may be divided into eighteen 10° intervals or
thirty-six 5° intervals for specifying the observed distribution. The
scale of such a division may be determined in part by the instrumental
limitations (perhaps intervals less than 5° cannot be accurately
defined by the particular apparatus), and in part by the amount of
data accumulated.
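The second route to a discrete distribution, dividing a continuous range into equal intervals, can be sketched in a few lines of Python. This is only an illustration: the function name `bin_angles` and the sample angles are hypothetical and are not measurements taken from the text.

```python
def bin_angles(angles_deg, width_deg=10.0, max_deg=180.0):
    """Count scattering angles in equal intervals covering 0 to max_deg."""
    n_bins = int(round(max_deg / width_deg))
    counts = [0] * n_bins
    for angle in angles_deg:
        if not 0.0 <= angle <= max_deg:
            raise ValueError(f"angle {angle} lies outside 0..{max_deg} degrees")
        # An angle of exactly max_deg is counted in the last interval.
        k = min(int(angle // width_deg), n_bins - 1)
        counts[k] += 1
    return counts


# Hypothetical scattering angles in degrees (illustration only).
sample_angles = [3.2, 8.9, 14.0, 17.5, 17.6, 42.1, 95.0, 179.9, 180.0]

counts_10 = bin_angles(sample_angles, width_deg=10.0)  # eighteen 10-degree intervals
counts_5 = bin_angles(sample_angles, width_deg=5.0)    # thirty-six 5-degree intervals
```

With so few data points most of the thirty-six 5° intervals are empty, which is the sense in which the amount of data limits how fine a division is worth using.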

By a continuous distribution, we will mean one for which the full
continuous range of values of the property is used, in the sense that
the intervals which would characterize a discrete distribution are
allowed to become arbitrarily small. To describe adequately how this
is done, the methods of the calculus must be used. We will therefore
postpone detailed discussion of continuous distributions to Chapter 3.
We may remark, however, that in principle an infinite population would
be needed to specify a continuous distribution, for as the intervals
are taken smaller, their number increases. Thus with any finite
population, there is a limit to the possible decrease in interval
size. Nevertheless, the populations which enter in many physical
problems are so enormous (as for example the $10^{19}$ molecules in a
cubic centimeter of gas) that they are effectively infinite, and the
methods of continuous distributions may be used without trouble.



1.2 GRAPHICAL REPRESENTATION OF 
DISTRIBUTION 

Discrete distributions may be represented graphically in various ways.
We shall use two slightly different methods, one of which is
appropriate for the "intrinsically discrete" distributions discussed
in the previous section, and the other for the distributions of
continuously variable quantities whose range is divided into
intervals. For the former we use a bar graph, for the latter a
histogram. Each is essentially a plot of frequency of occurrence
vertically against the possible values of the quantity horizontally.
The bar graph uses vertical lines at the positions of the discrete
index (usually an integer), since values between the discrete indices
are meaningless. The histogram uses rectangles of the appropriate
heights erected on each interval, since the individual values may have
occurred anywhere in the interval into which they have been grouped.

We will give an example of each type of graphical representation. In
Table 1.1 and Fig. 1.1, we have a table, and the corresponding bar
graph, giving the frequency of occurrence of various numbers of
"heads" in a series of fifty tosses of a group of four coins. The
possible outcomes are of course 0, 1, 2, 3, or 4 heads in each toss,
so a bar graph is appropriate. In Table 1.2 and Fig. 1.2, we have a
table and the corresponding histogram giving the distribution of
speeds, in intervals of 100 meters/sec, of 200 molecules.

VELOCITY INTERVAL,       NUMBER OF MOLECULES
METERS PER SECOND        WITH GIVEN VELOCITY

    0 -  100                      4
  100 -  200                     16
  200 -  300                     35
  300 -  400                     44
  400 -  500                     37
  500 -  600                     28
  600 -  700                     17
  700 -  800                     11
  800 -  900                      6
  900 - 1000                      2
                                ---
                                200

Table 1.2
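As a rough stand-in for a drawn histogram, the counts of Table 1.2 can be printed as rows of marks, one mark per molecule. The sketch below is only illustrative; the helper name `text_histogram` is hypothetical, and the bars run horizontally rather than vertically.

```python
# Counts from Table 1.2: molecules per 100 m/s speed interval.
EDGES = list(range(0, 1001, 100))             # interval boundaries, m/s
COUNTS = [4, 16, 35, 44, 37, 28, 17, 11, 6, 2]


def text_histogram(edges, counts):
    """Return one text line per interval, with one '#' per molecule."""
    lines = []
    for lo, hi, n in zip(edges[:-1], edges[1:], counts):
        lines.append(f"{lo:4d}-{hi:4d} m/s | {'#' * n} {n}")
    return lines


for line in text_histogram(EDGES, COUNTS):
    print(line)
```

Each row spans a whole interval rather than a single value, which is the feature that distinguishes a histogram from a bar graph.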

Fig. 1.2

Although it contains no more information than the table from which it
is constructed, the bar graph or histogram is useful in giving a quick
qualitative impression of a distribution. It can readily give a visual
comparison of two distributions, or of an observed distribution with a
theoretical or calculated one. But for more quantitative information,
such as averages and other numbers associated with the distribution,
one must usually refer to the data in the table.



1.3 CUMULATIVE DISTRIBUTIONS 

Occasionally a slightly different arrangement of the same information
is useful. We can, instead of giving the numbers of molecules in each
velocity interval as in Table 1.2, give the total number of molecules
with velocity less than 100, 200, ... meters/sec. Such a specification
is known as a cumulative distribution. In Table 1.3 and Fig. 1.3, the
same data as in Table 1.2 are given in this fashion. Evidently the two
types of distribution are easily obtained from one another; for
example, the differences between successive entries in the table for
the cumulative distribution give the entries for the corresponding
intervals of the original frequency distribution. Notice that a
cumulative distribution cannot decrease as one goes up the scale of
the measured property. In fact, it must ultimately increase from zero
to the number in the population.

VELOCITY,                NUMBER OF MOLECULES
METERS PER SECOND        WITH LESS VELOCITY

   100                            4
   200                           20
   300                           55
   400                           99
   500                          136
   600                          164
   700                          181
   800                          192
   900                          198
  1000                          200

Table 1.3
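The passage between the two forms can be checked directly on the counts of Table 1.2. The following sketch (variable names are ours, chosen for illustration) builds the cumulative distribution by running sums and then recovers the original frequencies by taking successive differences.

```python
from itertools import accumulate

COUNTS = [4, 16, 35, 44, 37, 28, 17, 11, 6, 2]   # Table 1.2
UPPER_EDGES = list(range(100, 1001, 100))         # "less than" limits, m/s

# Running totals: number of molecules with velocity below each limit.
cumulative = list(accumulate(COUNTS))

# Differences of successive cumulative entries give back the frequencies.
recovered = [b - a for a, b in zip([0] + cumulative[:-1], cumulative)]

for limit, total in zip(UPPER_EDGES, cumulative):
    print(f"less than {limit:4d} m/s: {total:3d}")
```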

A different way of specifying a 
cumulative distribution is familiar in 
the treatment of test scores. If a stu- 
dent is told he stands in the "78th 
percentile" on a certain test, he knows 
that his score is higher than that of 
78% of the students taking the test. 
This language (percentiles, or deciles, 
or quartiles) corresponds to using 
equal intervals (hundredths, tenths, 
quarters) of the total range of the 
vertical, or number, axis of such a 
graph as that of Fig. 1.3, rather than 
of the horizontal axis. One is asking, 
in effect, not how many molecules have
velocities lying in the various equal 
intervals, but rather what intervals 
of velocity correspond to equal numbers 
of molecules - one percent, or one 
tenth, or one quarter, respectively, 
of the total number. 



1.4 JOINT DISTRIBUTIONS 

In all the examples cited so far, one property of the individuals
comprising the population has been singled out for attention: the
speed of the molecules, the grades of the students, the angle of
scattering of the neutrons. But the individuals also have other
properties. The molecules have position and direction of motion as
well as speed; the students have ages, heights, and blood pressures as
well as exam grades; the neutrons have energies and momenta as well as
angles of scattering. The various properties may be related to one
another, or they may be quite independent. The molecule's speed - but
not its direction! - is closely related to its kinetic energy; the
student's height is unrelated to his examination grade, but is
somewhat related to his weight. In either case, we may define a joint
distribution of two or more such characteristics. By this we mean a
listing, by table, or graph, or mathematical function, of the number
of individuals of the population that have simultaneously certain
specified values of each of the two or more characteristics.

We give in Table 1.4 and Fig. 1.4 a rather prosaic example: a joint
distribution of heights and weights of 5000 men. Notice that a
two-dimensional array rather than a column is needed for the table,
and a three-dimensional histogram for the figure. The difficulties of
pictorially representing joint distributions for more than two
characteristics are apparent! Nevertheless they are of great
importance in physics. For example, the joint distribution in position
and speed of a molecule, and the joint distribution of the three
components of the velocity of a molecule, are basic distributions in
the kinetic theory of gases.
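A joint distribution of two characteristics is naturally held as a two-dimensional array. The sketch below stores the height and weight counts of Table 1.4 and looks up the number of men in a given pair of intervals. Several things here are assumptions made for illustration: the height intervals are read as 3-inch steps from 5 ft 0 in. to 6 ft 6 in., two cell entries that are unclear in the printed table (marked in the comments) are taken as the values that make the population come to 5000, and the names `JOINT` and `count_in` are hypothetical.

```python
from bisect import bisect_right

HEIGHT_EDGES_IN = [60, 63, 66, 69, 72, 75, 78]   # inches; assumed 3-inch steps
WEIGHT_EDGES_LB = [125, 150, 175, 200, 225, 250]  # pounds

# Rows: height intervals; columns: weight intervals (Table 1.4).
JOINT = [
    [9, 22, 19, 1, 0],
    [46, 158, 168, 76, 27],
    [51, 227, 423, 332, 89],
    [62, 309, 768, 683, 197],
    [21, 212, 392, 453, 120],   # 392 is an inferred reading (assumption)
    [0, 9, 37, 72, 17],         # 9 is an inferred reading (assumption)
]


def count_in(height_in, weight_lb):
    """Number of men in the height and weight intervals containing the given values."""
    i = bisect_right(HEIGHT_EDGES_IN, height_in) - 1
    j = bisect_right(WEIGHT_EDGES_LB, weight_lb) - 1
    if not (0 <= i < len(JOINT) and 0 <= j < len(JOINT[0])):
        raise ValueError("value outside the tabulated range")
    return JOINT[i][j]


population = sum(sum(row) for row in JOINT)
most_populated = max(
    (JOINT[i][j], i, j) for i in range(len(JOINT)) for j in range(len(JOINT[0]))
)
```

For instance, `count_in(70, 180)` returns the entry for men between 5 ft 9 in. and 6 ft 0 in. who weigh between 175 and 200 lbs.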



                              WEIGHT, LBS.

HEIGHT          125-150  150-175  175-200  200-225  225-250   TOTALS

5'0" - 5'3"          9       22       19        1        0       51
5'3" - 5'6"         46      158      168       76       27      475
5'6" - 5'9"         51      227      423      332       89     1122
5'9" - 6'0"         62      309      768      683      197     2019
6'0" - 6'3"         21      212      392      453      120     1198
6'3" - 6'6"          0        9       37       72       17      135

TOTALS             189      937     1807     1617      450     5000

Table 1.4

Fig. 1.4



1.5 PROBABILITY DISTRIBUTIONS;
FLUCTUATIONS

Each of the frequency distributions we have studied may be converted
to a relative frequency distribution by dividing each entry in the
table specifying the distribution by the total number in the
population. We will thus obtain a table consisting of fractions whose
sum is unity. Each entry in the new table will give the fraction of
the total population which lies in the specified interval. Now suppose
we make another set of measurements of the same kind on a new
population of the same type, whose total number is not the same. The
resulting frequency distribution will of course be different from the
first; but if we again convert to relative frequency, we expect the
fractions in the new distribution to be not too different from those
in the original set. Experience shows that as we accumulate more and
more data corresponding to larger and larger populations, the
fractions giving the relative frequencies of the various alternatives
tend to approach limits. These limiting values we call the
probabilities of the various alternatives, and the whole set of
probabilities for all the alternatives, which must add up to unity, we
call a probability distribution. Evidently such a definition can only
make sense for those cases in which populations of sufficiently large
number actually exist. The examples from physics with which we are
chiefly concerned are of this kind; the same is not always necessarily
so in other contexts. It would be difficult, for example, to attach
much meaning to the statement: "The probability that billion-dollar
corporations will go bankrupt in 1966 is 0.02%"; but a similar
assertion about small companies, of which there are many thousands,
could be perfectly sensible.
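The conversion to relative frequencies is a single division per entry. The sketch below applies it to the counts of Table 1.2 and to a second set of counts for a population twice as large. The second set is hypothetical, invented only to illustrate that the fractions can agree closely while the raw counts do not; exact fractions are used so that the sums come out to unity without rounding.

```python
from fractions import Fraction

COUNTS_A = [4, 16, 35, 44, 37, 28, 17, 11, 6, 2]     # Table 1.2, N = 200
COUNTS_B = [9, 30, 71, 90, 72, 55, 35, 23, 11, 4]    # hypothetical, N = 400


def relative_frequencies(counts):
    """Divide each entry by the total number in the population."""
    total = sum(counts)
    return [Fraction(n, total) for n in counts]


rel_a = relative_frequencies(COUNTS_A)
rel_b = relative_frequencies(COUNTS_B)

# Largest disagreement between the two sets of fractions.
max_diff = max(abs(a - b) for a, b in zip(rel_a, rel_b))
```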

So long as we are dealing with a "sufficiently large" sample of the
over-all population of interest, we may identify its relative
frequency distribution with the probability distribution of the
population. The work of the next chapter will lead us to a criterion
for judging whether a sample is "sufficiently large." But what if the
sample is not that large? Clearly, different samples will exhibit
somewhat different distributions, even though they are drawn from the
same larger population. These variations (in physics they are often
called "fluctuations") are of great importance in practical
applications. An opinion-polling organization must know how large a
sample to question in order that the results obtained are reasonably
representative of the actual distribution of opinion in the whole
population. A manufacturer producing large numbers of standardized
items needs to know how large a sample must be tested for conformity
to the standard, in order to have reasonable assurance that no
defective units are allowed on the market. Here again, the work of the
next chapter will give us means of estimating these variations for
samples of given sizes.
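Fluctuations of this kind are easy to imitate numerically. In the sketch below the relative frequencies of Table 1.2 are simply assumed to be the probability distribution of the whole population (an assumption made for illustration), samples of several sizes are drawn at random, and the largest departure of the sample fractions from the assumed probabilities is reported. The function names and the seed are arbitrary choices.

```python
import random

COUNTS = [4, 16, 35, 44, 37, 28, 17, 11, 6, 2]    # Table 1.2
PROBS = [n / sum(COUNTS) for n in COUNTS]          # assumed population probabilities


def sample_fractions(sample_size, rng):
    """Draw interval indices at random and return the sample's relative frequencies."""
    draws = rng.choices(range(len(PROBS)), weights=PROBS, k=sample_size)
    return [draws.count(k) / sample_size for k in range(len(PROBS))]


def largest_deviation(sample_size, rng):
    """Largest difference between a sample fraction and the assumed probability."""
    fractions = sample_fractions(sample_size, rng)
    return max(abs(f - p) for f, p in zip(fractions, PROBS))


rng = random.Random(1965)   # fixed seed so that a run can be repeated
for size in (20, 200, 20000):
    print(size, round(largest_deviation(size, rng), 4))
```

One would expect the printed deviation to shrink as the sample grows, though any single run can depart from that trend.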



1.6 EXPERIMENT AND THEORY 

In most of the examples discussed above, we have been thinking of the
distributions as being found experimentally. A large number of
molecular velocities, or neutron-scattering angles, or men's heights
and weights, are measured, tabulated, and converted to distributions.
But not all distributions are experimental; they may be deduced
theoretically from physical or mathematical assumptions. Frequently
they require techniques from various branches of mathematics -
particularly probability theory - for their derivation. Since much of
probability theory is concerned with the calculation of other
probabilities from sets of given ones, the basic rules for combining
probabilities are used frequently in the following chapters. We will
therefore state them here for reference.

First: if A and B are mutually exclusive alternatives, the probability
that either A or B will occur is the sum of the probabilities of A and
of B occurring separately. (Example: the probability that either 1 or
2 heads show in a toss of 4 coins is the sum of the probabilities that
one head shows, and that two heads show.)

Second: the probability that first A and then B occur in successive
independent trials is the product of the probabilities of A and of B
occurring separately. (Example: the probability that first two heads
show, and then one head shows in a second toss of four coins, is the
product of the probabilities that one head shows and that two show.)
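Both examples can be worked out numerically if we assume, as the text does not yet state, that the coins are fair, so that each of the 16 possible outcomes of a toss of four coins is equally likely. The helper name `p_heads` is hypothetical; the first rule is also checked by direct counting of outcomes.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_heads(k, n_coins=4):
    """Probability of exactly k heads in one toss of n_coins fair coins (assumed fair)."""
    return Fraction(comb(n_coins, k), 2 ** n_coins)


# First rule: mutually exclusive alternatives add.
p_one_or_two = p_heads(1) + p_heads(2)

# Second rule: successive independent trials multiply.
p_two_then_one = p_heads(2) * p_heads(1)

# Check of the first rule by listing all 16 outcomes (1 = head, 0 = tail).
outcomes = list(product((0, 1), repeat=4))
favorable = sum(1 for o in outcomes if sum(o) in (1, 2))
p_by_counting = Fraction(favorable, len(outcomes))
```

Under the fair-coin assumption the first probability comes to 5/8 and the second to 3/32.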

With the help of these apparently simple rules, elaborate
superstructures of theory can be erected. But it is important to
remember that, for the physicist, the ultimate test of the validity of
theoretical calculations of the type which we will do in the next
chapters is their comparison with experiment.

PROBLEMS



1.1 Give examples of a distribution which might be of interest in
    (a) psychology, (b) linguistics, (c) economics. In each case,
    specify the population, and the range of values of the
    characteristic whose distribution you envisage.

1.2 Sometimes a cumulative distribution is defined by the number
    greater than (instead of less than) a series of successive equally
    spaced values of the property in question. Construct such a
    cumulative distribution for the molecular velocities of Table 1.2.

1.3 What is the meaning of the last row and of the last column
    (labeled totals) of the joint distribution given in Table 1.4?

1.4 The median of a distribution is the value of the property such
    that half the population lies above and half below it in value.
    What is the median for each of the distributions in Table 1.1 and
    Table 1.2?

1.5 Assuming the distribution of velocities in Table 1.2 to be
    sufficiently representative of the whole population, what is the
    probability that: (a) the velocity of a molecule lies between 200
    and 700 meters/sec? (b) that three molecules chosen independently
    all have velocities lying between 300 and 400 meters/sec?

1.6 Imagine an instructor who gives only three marks, high (H), medium
    (M), and low (L), on every test, and who gives one third of the
    class each mark. When averaging two tests, he gives H only to
    those who have H on each, and similarly for L. All the rest get M.
    What fraction of the class will have averages of H, L, and M,
    respectively, on two tests? On three? Using the foregoing instance
    as a guide, explain why a student who is never first in the class
    on any given test, but who is consistently high in standing, often
    ends the term with the best average in the class.



2.1 NOTATION; MEAN AND OTHER AVERAGES

We will need a notation to describe the discrete distributions with
which this chapter will deal. Let us denote by $N$ the total number in
the population in question; then the number of individuals falling in
the kth interval of the property whose distribution is under study we
will denote by $n_k$. The sum of all the numbers in the various groups
must be $N$; using the customary notation for summation, we have

$$\sum_k n_k = N, \tag{2.1}$$

where the summation extends over all the intervals into which the
range of values has been divided. The corresponding probabilities we
will denote by $p_k$; as remarked in Section 1.5, this probability is
found by dividing $n_k$ by $N$:

$$p_k = n_k/N, \tag{2.2}$$

or

$$n_k = N p_k,$$

from which we obtain

$$\sum_k p_k = 1. \tag{2.3}$$

Thus the probability that an individual chosen at random from the
population is found to be in the kth interval is $p_k$; Eq. (2.3)
represents the statement that the total probability of finding the
individual somewhere among all the intervals is unity, i.e., it is
sure to be found in one or another interval.

The value of the property under study, in the kth interval, will be
denoted by an appropriate symbol - differing from example to example -
with subscript "k"; in the example of heights in Table 1.4, we may use
$h_k$; for velocities of molecules in Table 1.2 we might use $v_k$,
and so on. Notice that there is some ambiguity in the phrase "the
value in the kth interval"; there is no unique "value" of the height
for the interval from 5 ft 6 in. to 5 ft 9 in. In examples of this
kind we will agree to use the value at the midpoint of the interval
(e.g., 5 ft 7½ in. for the case cited). In certain other examples of
discrete distributions there will be no ambiguity, however, because
the characteristic studied takes on strictly discrete (perhaps
integer) values. If we ask for the distribution of the total obtained
in a number of throws of a pair of dice, for instance, the values are
integers ranging from 2 to 12; in examples of this type, the phrase
"value in the kth interval" may be replaced throughout by "kth value"
in the preceding discussion.

An extremely important quantity associated with the distribution is
the mean value or average of the property under study. It is defined
just as the ordinary arithmetic average of common usage, as the sum of
the values of the property for all individuals of the population,
divided by the total number in the population. Since the individuals
are grouped in such a way that $n_1$ have the value $h_1$, $n_2$ have
the value $h_2$, and so on, the average, which we will denote by
$\langle h \rangle$, is given by

$$\langle h \rangle = (n_1 h_1 + n_2 h_2 + \cdots)/N
  = \Big(\sum_k n_k h_k\Big)/N. \tag{2.4}$$

Bearing in mind Eq. (2.2), which defined the probabilities $p_k$, we
may also write

$$\langle h \rangle = \sum_k p_k h_k. \tag{2.5}$$

This mean value is the single number most often used to characterize a
distribution - if a single number must be used! In everyday usage this
is very familiar; one hears references to such things as the average
income in a state or country, the average life expectancy of a
25-year-old female, or the average number of years of schooling of
this group or that. In each of these cases, the "average" is being
used as a single number to characterize a whole body of information
which really constitutes a distribution. The actual incomes may vary
from 0 to $100,000, the life expectancies from 1 to 75 years, and the
years of education from 0 to 20; but the average - which is the single
value which all the individuals would need to have, in order to give
the same total value for the whole population that in fact exists - is
used as a quick summary.
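As a worked instance of Eqs. (2.4) and (2.5), the mean speed for the molecules of Table 1.2 can be computed with the midpoint of each 100 meters/sec interval taken as the value for that interval, following the convention agreed on above. The variable names are ours and purely illustrative.

```python
COUNTS = [4, 16, 35, 44, 37, 28, 17, 11, 6, 2]            # n_k, Table 1.2
MIDPOINTS = [50 + 100 * k for k in range(len(COUNTS))]     # v_k in m/s (interval midpoints)

N = sum(COUNTS)                                            # Eq. (2.1)

# Eq. (2.4): weight each value by the number of individuals having it.
mean_from_counts = sum(n * v for n, v in zip(COUNTS, MIDPOINTS)) / N

# Eq. (2.5): the same average written with probabilities p_k = n_k / N.
PROBS = [n / N for n in COUNTS]                            # Eq. (2.2)
mean_from_probs = sum(p * v for p, v in zip(PROBS, MIDPOINTS))

print(mean_from_counts)   # 425.5
```

The two forms agree, as they must, since Eq. (2.5) is only Eq. (2.4) with the division by N carried inside the sum.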

It is however by no means the only mean or average which may be
needed. There are many other averages which may be defined, and which
may be more appropriate for some particular use. The molecules of a
gas have a distribution of speeds (the "Maxwellian" distribution,
which we will study later); and the average or mean speed defined as
in Eqs. (2.4) or (2.5):

$$\langle v \rangle = \sum_k p_k v_k = \Big(\sum_k n_k v_k\Big)/N$$

is often useful. But the kinetic energy of a molecule, according to
classical mechanics, is proportional to the square of the velocity;
hence if we need (as we will in the kinetic theory of gases)
information about the average, or mean, kinetic energy of the gas of
molecules, we must find the average of the square of the velocity:

$$\langle v^2 \rangle = \sum_k p_k v_k^2
  = \Big(\sum_k n_k v_k^2\Big)/N. \tag{2.6}$$

In a similar fashion, other functions of the velocity which occur in
other contexts may need to be averaged over the distribution of
velocities, and we will use a similar notation in each case:

$$\langle f(v) \rangle = \sum_k p_k f(v_k)
  = \Big(\sum_k n_k f(v_k)\Big)/N. \tag{2.7}$$

In particular, the averages of the powers of the quantity whose
distribution is being studied, which are known as the moments of the
distribution, are of interest; not infrequently in physics, various
moments of a distribution may be accessible to direct measurement,
even though the distribution as a whole is not. From a knowledge of
the moments one can reconstruct - approximately - the distribution
itself, or at least confirm whether or not some theoretically
predicted distribution yields the observed moments.
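Equation (2.7) translates directly into a small function that averages any function of the velocity over a tabulated distribution; the moments are then the special case of powers. The sketch below applies it to Table 1.2 with interval midpoints as the values. The names `average` and `moment` are hypothetical.

```python
COUNTS = [4, 16, 35, 44, 37, 28, 17, 11, 6, 2]            # n_k, Table 1.2
MIDPOINTS = [50 + 100 * k for k in range(len(COUNTS))]     # v_k in m/s


def average(f, counts, values):
    """<f(v)> = (sum over k of n_k f(v_k)) / N, as in Eq. (2.7)."""
    total = sum(counts)
    return sum(n * f(v) for n, v in zip(counts, values)) / total


def moment(r, counts, values):
    """The r-th moment: the average of the r-th power of the quantity."""
    return average(lambda v: v ** r, counts, values)


mean_speed = moment(1, COUNTS, MIDPOINTS)     # <v>,   Eq. (2.4)
mean_square = moment(2, COUNTS, MIDPOINTS)    # <v^2>, Eq. (2.6)
rms_speed = mean_square ** 0.5                # square root of <v^2>
```

Since the kinetic energy is proportional to the square of the velocity, the mean kinetic energy is proportional to `mean_square`, not to the square of `mean_speed`; the two differ whenever the distribution has any spread.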

2.2 WIDTH OF A DISTRIBUTION; STANDARD 
DEVIATION 

If the mean value is the single number most often used to characterize
a distribution, the number second in importance after the mean is one
which measures the width of the distribution, or the spread of values
about the mean. Clearly, many widely different distributions can have
the same mean; Figs. 2.1, 2.2, and 2.3 give examples of possible
distributions of a quantity $q$ whose values range from $q_{min}$ to
$q_{max}$. In each case the mean value $\langle q \rangle$ is the
same, but the "shapes" of the distributions are quite different; in
Fig. 2.1, all values between $q_{min}$ and $q_{max}$ are nearly
equally likely, in Fig. 2.2, only values near the mean are likely, and
Fig. 2.3 is intermediate between the extremes represented by the first
two.

How is the "width" to be measured? 
Evidently what is needed is some esti- 
mate of the likelihood of various devi- 
ations from the mean value; large devi- 
ations are likely in Fig. 2.1, unlikely 
in Fig. 2.2, and moderately likely in 
Fig. 2.3. One's first thought might 
reasonably be to take the average of 
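the deviations from the mean. Denoting
by $\Delta q_k$ the difference between a particular
value $q_k$ and the mean $\langle q \rangle$,

$$\Delta q_k = q_k - \langle q \rangle, \tag{2.8}$$

we can calculate its mean in the
standard way:

$$\langle \Delta q \rangle = \langle q - \langle q \rangle \rangle = \Big[\sum_k n_k (q_k - \langle q \rangle)\Big]/N = \sum_k p_k (q_k - \langle q \rangle); \tag{2.9}$$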

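but a little reflection shows that
this quantity is necessarily zero: the
sum of the $p_k q_k$ is $\langle q \rangle$, and the sum of
the $p_k$ is unity, so the two terms cancel.
It is customary therefore to take instead
the mean of the squared deviation from
the mean:

$$\langle (\Delta q)^2 \rangle = \langle (q - \langle q \rangle)^2 \rangle = \Big[\sum_k n_k (q_k - \langle q \rangle)^2\Big]/N. \tag{2.10}$$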

This quantity is called the variance 
of the distribution, and its square 
root (that is, the "root mean square" 
deviation) is called the standard deviation
of the distribution. It is
often denoted by $\sigma$ (lower case Greek
"sigma"):

$$\sigma = \sqrt{\langle (q - \langle q \rangle)^2 \rangle} = \sqrt{\sum_k p_k (q_k - \langle q \rangle)^2} = \sqrt{\sum_k p_k (\Delta q_k)^2}. \tag{2.11}$$



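A useful alternative expression for $\sigma$
may be found by expanding the squared
deviation:

$$\begin{aligned}
\sigma^2 &= \sum_k p_k (\Delta q_k)^2 \\
&= \sum_k p_k \big(q_k^2 - 2\langle q \rangle q_k + \langle q \rangle^2\big) \\
&= \langle q^2 \rangle - 2\langle q \rangle\langle q \rangle + \langle q \rangle^2 \\
&= \langle q^2 \rangle - \langle q \rangle^2.
\end{aligned} \tag{2.12}$$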

The last steps have made use of Eqs. 
(2.3), (2.5), and (2.6). In words, one 
may say that the variance is the dif- 
ference between the mean square of the 
quantity and the square of the mean of 
the quantity. Since, in the language 
used in the last part of the previous 
section, the mean is the "first moment" 
and the mean square is the "second moment"
of the distribution, one may say
that knowledge of the value of the first
moment specifies the mean, and that
knowledge of the second moment as well
specifies the variance. Higher moments
would give successively more detailed
features of the distribution.

For example, the third moment will give 
some indication of whether positive or 
negative deviations from the mean pre- 
dominate; the variance gives no clue to 
this, since it involves the square of 
the deviation, to which positive and 
negative deviations contribute equally. 
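
A small numerical check of Eq. (2.12) may help fix the idea. The sketch below is illustrative only: the lopsided distribution is invented, and the "third moment" is taken here about the mean (an assumption about which third moment is meant). It computes the variance both directly and from the two moments, and then the third moment, whose sign shows which side of the mean the large deviations lie on.

```python
from fractions import Fraction

def summarize(p):
    """p maps each value q_k to its probability p_k (assumed to sum to 1)."""
    mean = sum(pk * q for q, pk in p.items())
    mean_sq = sum(pk * q * q for q, pk in p.items())
    var_direct = sum(pk * (q - mean) ** 2 for q, pk in p.items())  # Eq. (2.10)
    var_moments = mean_sq - mean ** 2                              # Eq. (2.12)
    third = sum(pk * (q - mean) ** 3 for q, pk in p.items())       # about the mean
    return mean, var_direct, var_moments, third

# Hypothetical distribution with a long tail toward large q.
p = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}
mean, var_direct, var_moments, third = summarize(p)
print(mean, var_direct, var_moments, third)
```

The two forms of the variance agree exactly, and the third moment about the mean comes out positive because the rare value q = 4 lies far above the mean.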

2.3 THE BINOMIAL DISTRIBUTION 

A distribution which arises natu- 
rally in a variety of physical prob- 
lems is the binomial distribution. It 
arises whenever a choice of two altern- 
atives is available, and the choice is 
made many times. For example, in a
"random walk" problem, which is a model
for Brownian motion or for molecular
diffusion, one imagines a particle moving
to the right or left along a line
in steps of equal size. If each step is
equally likely to be to the left or to
the right, what is the likelihood that
the particle will have moved a certain
net distance to the right after a given
number of steps? Or again, under certain
conditions, an atom with a magnetic
moment in a magnetic field may,
according to quantum theory, align itself
in two ways only: along or against
the field. In a large collection of
such atoms, what is the likelihood of
finding a given number aligned with or
against the field? An even simpler
question, which we will adopt for illustrating
the binomial distribution, is
the following: In a box of gas, how
many of the molecules will be at any
instant in the right half of the box?
The question may sound trivial at 
first; surely half the molecules are 
in each half of the box - at least "on 
the average"! But how do we know that 
this is so? Or even if it is so, what 
does "on the average" mean? What are 
the chances of finding a few more or a 
few less than half of the molecules in 
the right half? Does it matter whether 
the gas is at normal pressure or ex- 
tremely rarefied? 

To consider this problem, suppose 
the box contains M molecules; let j be 
in the left half and k in the right 
half. Then we must have 

$$j + k = M. \tag{2.13}$$

The possible values of k range from 0
to M; we want the distribution of k
over these values. To illustrate the 
method, consider first the case of 
three molecules, although our interest 
is really in large numbers. In this 
case we can simply enumerate the possi- 
ble ways of assigning molecules to the 
two halves respectively. In Fig. 2.4, 
the possible assignments are sketched, 
and the numbers j and k for each assignment
are listed, together with the
number n k of assignments for which k 
molecules are in the right half, and 
the probability p k of finding k mole- 
cules in the right half. 

Fig. 2.4

Figure 2.4 shows that there are a
total of eight possible assignments.
This is understandable, since there are
two possibilities for each molecule,
hence 2 x 2 x 2 for all three. Of these
eight assignments, one corresponds to
no molecules in the right half, three
correspond to one molecule, three to
two molecules, and one to three. The
corresponding probabilities are respectively
1/8, 3/8, 3/8, and 1/8, as indicated
in the last column of the figure.
Thus, in a very large number of boxes
containing three molecules, we expect
to find that 1/8 of them contain no
molecules in the right half, 3/8 of
them contain one, 3/8 of them contain
two, and 1/8 of them contain three.
Alternatively, we may say that these 
fractions give the number of times, in 
a sequence of a large number of inde- 
pendent looks at the same box, that one 
will see the specified number of mole- 
cules in the right half. From these 
probabilities we may calculate the mean 
number, and the various other averages 
which might be of interest, in accord- 
ance with the formulas of sections 2.1 
and 2.2. Instead of doing this for the 
special case of three molecules, how- 
ever, let us first go on to the general 
case of M molecules. 
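
The enumeration described for three molecules is easy to reproduce by machine. The following Python sketch is an illustration only (the labels "L" and "R" and the variable names are our own): it lists every assignment of M = 3 molecules to the two halves and counts how many put k molecules on the right.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

M = 3  # number of molecules, as in the three-molecule example

# Each assignment is a tuple such as ('L', 'R', 'R'): one letter per molecule.
assignments = list(product("LR", repeat=M))

# n[k] = number of assignments with exactly k molecules in the right half.
n = Counter(a.count("R") for a in assignments)

# p[k] = n_k divided by the total number of assignments.
p = {k: Fraction(n[k], len(assignments)) for k in range(M + 1)}

for k in range(M + 1):
    print(k, n[k], p[k])
```

Changing M to a larger value shows how quickly the list of assignments grows, which is why a general formula is wanted.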

In the case of M molecules, there
will be a total of $2^M$ possible assignments
of each of the M molecules to the
right or the left half. How many of
these correspond to precisely k molecules
in the right half? Only one assignment
gives k = 0: every molecule
on the left. But M assignments give
k = 1; the single molecule on the right
may be chosen in M different ways. For
k = 2, the first of the two molecules
on the right may be chosen in M ways,
and the second in (M - 1) ways; this
gives a total of M(M - 1) ways. But notice
that this counts as distinct assignments
those in which the same two
molecules are placed on the right, but 
in reversed order. Since a single as- 
signment is specified by saying, for 
example, "put molecules #5 and #17 on 
the right" without taking into account 
the order in which #5 and #17 are se- 
lected, we must correct for the spuri- 
ous doubling of the number of assign- 
ments which our method of counting 
gives. Taking this into account gives
finally the result M(M - 1)/2 for the
number of distinct assignments of two
molecules to the right half. For k = 3,
the same line of reasoning gives the
result M(M - 1)(M - 2)/6; here the factor
in the numerator gives the number
of ways of picking three molecules in 
a particular order, and the factor 6 
in the denominator corrects for the 
number of permutations of the three 
molecules selected to be in the right 
half. For general k, the result is 

$$n_k = M(M-1)(M-2)\cdots(M-k+1)/k!, \tag{2.14}$$

where again the denominator contains
the factor k! (to be read "k factorial,"
the product of the first k integers),
which is the number of permutations
of k objects, and which corrects
for the overcounting of the assignments
which the numerator effects. A
somewhat more symmetrical form can be
found by multiplying numerator and denominator
both by (M - k)!, the product
of the integers from 1 up through
(M - k):

$$n_k = \frac{M!}{k!\,(M-k)!}. \tag{2.15}$$



Bearing in mind Eq. (2.13), we may 
also write: 

$$n_k = \frac{M!}{k!\,j!}, \qquad (k + j = M). \tag{2.16}$$



In this form, the symmetry between 
right and left halves becomes apparent; 
interchanging k and j leaves the ex- 
pression unchanged, as it should. One 
can now easily verify that the expressions
we wrote down earlier for $n_1$, $n_2$,
and $n_3$ agree with the general expression.
For $n_0$, which we saw earlier has
the value 1, the general expressions in
Eqs. (2.15) or (2.16) are valid, provided
we adopt the customary convention
that 0! = 1.
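
The counting argument that leads to Eq. (2.14) can be checked against the factorial form of Eq. (2.15). The sketch below is illustrative (the function name `n_k` is ours): it builds the ordered count M(M-1)...(M-k+1), divides by k!, and compares the result with Python's built-in binomial coefficient `math.comb`.

```python
from math import comb, factorial

def n_k(M, k):
    """Number of ways to put k of M molecules on the right, as in Eq. (2.14)."""
    ordered = 1
    for i in range(k):
        ordered *= M - i           # M (M-1) ... (M-k+1): choices made in order
    return ordered // factorial(k) # divide out the k! orderings of the same set

for M in range(1, 9):
    row = [n_k(M, k) for k in range(M + 1)]
    # comb(M, k) is M!/(k!(M-k)!), the form of Eq. (2.15)
    assert row == [comb(M, k) for k in range(M + 1)]
    print(M, row)
```

The empty product for k = 0 gives 1, which corresponds to the convention 0! = 1 adopted in the text.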

The name "Binomial Distribution"
for our result arises from the fact
that the numbers $n_k$ defined by Eq.
(2.16) are precisely the same numbers
that occur as the coefficients in the
binomial expansion of mathematics:

$$(a + b)^M = \sum_{k=0}^{M} n_k\, a^{M-k} b^k. \tag{2.17}$$

That this is so is understandable when
one reflects that the kth coefficient
in the expansion counts the number of
ways of picking out k b's and (M - k)
a's from the set of M factors (a + b)
that are implied by $(a + b)^M$. Formally,
this is identical with our problem of
picking out k molecules to put in the
right half and (M - k) to put in the
left half, from the total of M molecules.
We can use Eq. (2.17) to verify
that the total number of assignments is
$2^M$, as we asserted earlier; for if we
let a = b = 1 in Eq. (2.17), we get

$$(1 + 1)^M = \sum_{k=0}^{M} n_k\, (1)^{M-k} (1)^k,$$

or

$$2^M = \sum_{k=0}^{M} n_k = \sum_{k=0}^{M} \frac{M!}{k!\,(M-k)!}. \tag{2.18}$$



In Table 2.1, the binomial coeffi- 
cients are listed for values of M from 
1 to 8; for each M, the values of k run 
from 0 through M. The sum of the coef- 
ficients is also listed for each M. No- 
tice that the table can be very easily 
continued; given the set of n k for a 
particular M, the next row in the ta- 
ble, corresponding to M + 1, is given 
by the rule : 

$$n_k\ (M+1\ \text{row}) = n_k + n_{k-1}\ (M\ \text{row}). \tag{2.19}$$
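
The rule of Eq. (2.19) is all that is needed to generate a table of binomial coefficients row by row. The following Python sketch is an illustration under our own naming (`next_row`, `table`); it starts from the row for M = 1 and builds the rows up to M = 8, printing the sum of each row alongside.

```python
def next_row(row):
    """Apply Eq. (2.19): each new n_k is n_k + n_(k-1) of the previous row."""
    padded = [0] + row + [0]  # n_(-1) and n_(M+1) are taken to be zero
    return [padded[i] + padded[i + 1] for i in range(len(row) + 1)]

row = [1, 1]          # the row for M = 1
table = {1: row}
for M in range(2, 9):
    row = next_row(row)
    table[M] = row

for M, coefficients in table.items():
    print(M, coefficients, sum(coefficients))
```

Each row sum doubles from one row to the next, in agreement with Eq. (2.18).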



The corresponding probabilities are
found by dividing the $n_k$ by the total
number of assignments $2^M$; hence

$$p_k = \frac{1}{2^M}\,\frac{M!}{k!\,(M-k)!}. \tag{2.20}$$



We can now examine the distribution 
and use it to answer some of the ques- 
tions raised initially. A glance at 
Table 2.1 shows that, for all values of 
M occurring there, the most probable 
value of k (the one for which $n_k$, and
hence also $p_k$, is largest) is always
M/2 if M is even, and that if M is
odd, the two integers nearest to
M/2 are equally the most probable.
It is not hard to show that this
remains true for any M. Thus one
is indeed more likely to find just 
one half of the molecules in the 
right (or the left) half than any 
other particular value. What about 
the average, or mean number? From 
the definition, it is given by

$$\langle k \rangle = \sum_{k=0}^{M} k\, p_k = \frac{1}{2^M} \sum_{k=1}^{M} k\, \frac{M!}{k!\,(M-k)!}. \tag{2.21}$$

By actual calculation, using the first 
few values of M and Table 2.1, one 
finds the value M/2 in each instance; 
that it is true in general follows 
from Eq. (2.21) by rewriting it slight- 
ly differently, and using Eq. (2.18), 
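with (M - 1) substituted for M:

$$\begin{aligned}
\langle k \rangle &= \frac{1}{2^M} \sum_{k=1}^{M} \frac{M\,(M-1)!}{(k-1)!\,[(M-1)-(k-1)]!} \\
&= \frac{M}{2^M} \sum_{q=0}^{M-1} \frac{(M-1)!}{q!\,(M-1-q)!} \\
&= \frac{M}{2^M} \cdot 2^{M-1} = M/2.
\end{aligned} \tag{2.22}$$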

Hence the average number found in 
either half of the box, in many trials, 
will also be just half of the total 
number of molecules. 

Finally, the likelihood of devia- 
tions from the mean value can be exam- 
ined; for this purpose we need, as 
shown in section (2.2), the mean of the 
square of k. Before examining this 
quantity mathematically, let us see 
qualitatively what to expect, by plot- 
ting the distribution of k for various 
values of M. Fig. 2.5
shows bar graphs of $n_k$ against k for
M = 8, 40, 200, and 1,000. For ease of
comparison they are drawn with the same
ordinate at the maximum, and with the
same range of the abscissa corresponding
to the full scale from 0 to M in
each case. For M = 200 and M = 1000, not
every value of k has its $n_k$ drawn in
because of the smallness of the scale.

Notice that the graphs of the dis- 
tributions become narrower as M in- 
creases. Since they are drawn to the 
same relative scale, this means that 
the probability of a deviation from the 
mean value M/2 by any given fraction of 
M becomes smaller as M increases. Thus 
although the probability of finding 
3 or 5 molecules out of 8 in either 
half of the box is not very much less 
than the probability of finding 4, the 
probability of finding 15 or 25 out of
40 is substantially less than that of
finding 20. For M = 200, the corresponding
numbers, 75 and 125, have extremely
small probability compared with
100, and for M = 1000, the corresponding
probabilities are entirely negligible.

We can confirm this qualitative 
conclusion by calculating the standard 
deviation, using the definition of section
2.2. According to Eq. (2.12), we
must first find $\langle k^2 \rangle$:

$$\langle k^2 \rangle = \sum_{k=0}^{M} k^2 p_k = \frac{1}{2^M} \sum_{k=0}^{M} k^2\, \frac{M!}{k!\,(M-k)!}.$$

A trick which is useful, because of the
occurrence of the factorial k! in the
denominator, is to rewrite $k^2$ as
$(k^2 - k + k)$, or $(k(k-1) + k)$; then we
have:

$$\begin{aligned}
\langle k^2 \rangle &= \langle k(k-1) + k \rangle = \langle k(k-1) \rangle + \langle k \rangle \\
&= \sum_{k=2}^{M} \frac{k(k-1)}{2^M}\,\frac{M!}{k!\,(M-k)!} + \langle k \rangle \\
&= \frac{M(M-1)}{2^M} \sum_{k=2}^{M} \frac{(M-2)!}{(k-2)!\,[(M-2)-(k-2)]!} + M/2 \\
&= \frac{M(M-1)}{2^M} \cdot 2^{M-2} + M/2 \\
&= (M^2 - M)/4 + M/2 \\
&= (M^2 + M)/4.
\end{aligned} \tag{2.23}$$



Hence the standard deviation, according
to Eq. (2.12), is obtained from

$$\sigma^2 = \langle k^2 \rangle - \langle k \rangle^2 = (M^2 + M)/4 - (M/2)^2 = M/4, \tag{2.24}$$

and its ratio to the mean value $\langle k \rangle$ is:

$$\sigma/\langle k \rangle = (M/4)^{1/2}/(M/2) = 1/M^{1/2}. \tag{2.25}$$

Fig. 2.5

That is, the width of the distribution,
although it increases in proportion to
$M^{1/2}$ in absolute size, decreases in
proportion to $1/M^{1/2}$ relative to the
average number, in agreement with our
qualitative result above based on the
appearance of the graphs of Fig. 2.5.
Thus a few cubic centimeters of gas,
containing, say, $2 \times 10^{20}$ molecules,
will have "on the average" $10^{20}$ molecules
in each half; this number may
fluctuate by about $10^{10}$ from its mean
value. Although $10^{10}$ is a very large
number, it is a minute fraction of
$10^{20}$: one part in $10^{10}$! It would be
extremely hard ever to observe a deviation
from the average so small as
this. If however we imagine reducing
the pressure until there are only 200
molecules in the same space, the average
number of 100 in each half could
fluctuate by about $\sqrt{100}$, or 10; this
is 10%, or a sizable fraction of the
average.
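
The narrowing of the distribution can also be seen by direct summation, without the algebra of Eqs. (2.22) to (2.25). The Python sketch below is illustrative only (the function names are ours): for the four values of M plotted in Fig. 2.5 it sums over the binomial coefficients to obtain the mean and the standard deviation, compares their ratio with $1/M^{1/2}$, and also gives the ratio of $n_k$ at k = 3M/8 to its peak value at k = M/2.

```python
from math import comb, sqrt

def symmetric_binomial_stats(M):
    """Mean and standard deviation of k, summed directly from p_k = n_k / 2^M."""
    total = 2 ** M
    mean = sum(k * comb(M, k) for k in range(M + 1)) / total
    mean_sq = sum(k * k * comb(M, k) for k in range(M + 1)) / total
    return mean, sqrt(mean_sq - mean ** 2)  # Eq. (2.12)

def ratio_to_peak(M):
    """n_k at k = 3M/8 (3 of 8, 15 of 40, ...) relative to n_k at k = M/2."""
    return comb(M, 3 * M // 8) / comb(M, M // 2)

for M in (8, 40, 200, 1000):
    mean, sigma = symmetric_binomial_stats(M)
    print(M, mean, sigma / mean, 1 / sqrt(M), ratio_to_peak(M))
```

The third and fourth columns of the output agree, and the last column falls rapidly toward zero as M grows, which is the behavior described for the bar graphs.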



2.4 THE ASYMMETRIC BINOMIAL DISTRI- 
BUTION 

An assumption was hidden in the 
work of the previous section: plausible, 
but nevertheless an assumption. The 
eight alternatives listed explicitly in 
Fig. 2.4, for the assignment of three 
molecules to the two halves of the box, 
were regarded as "equally likely" to 
occur. But suppose the imaginary parti- 
tion dividing the box into two parts is 
moved to the left, so that the left and 
right sections contain one-third and 
two-thirds of the total volume, respec- 
tively. The enumeration of assignments 
of Fig. 2.4 is still correct; but we 
feel it absurd to regard them all as
"equally likely." How should we weight
the various assignments? 

It seems intuitively plausible to 
regard a single molecule as twice as 
likely to be found in the right side as 
the left under these circumstances,
since its volume is twice as great.
Stated otherwise, the probability of 
finding a single molecule on the left 
is one-third, and on the right is two- 
thirds. Then the probability for find- 
ing k molecules on the right and 
(M - k) on the left is given by the 
number of assignments found previously, 
but multiplied by a factor $(2/3)^k
\times (1/3)^{M-k}$, since probabilities of independent
events are multiplied to find
the probability of simultaneous occurrence.
The result is therefore:

$$p_k = \frac{M!}{k!\,(M-k)!}\,(2/3)^k\,(1/3)^{M-k}. \tag{2.26}$$



We can obtain the same result in a 
slightly different way. If a single 
molecule is twice as likely to be found 
on the right as on the left, a pair of 
molecules is four times as likely to be 
found on the right as on the left, and 
a triplet is eight times as likely. 
Hence the set of n k 's of Fig. 2.4 
should be multiplied by 1 for k = 0, by
2 for k = 1, by 4 for k = 2, and by 8
for k = 3 to give the properly weighted
assignments, as shown in Table 2.2 below.
Notice that the total number of
assignments with the new weights is 27,
which is $3^3$; dividing by this number,
we obtain the set of probabilities p k 
given in the last column. But these 
agree exactly with the result of Eq. 
(2.26), when M is set equal to 3. 
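
The two routes to the weighted probabilities can be compared directly. The sketch below is an illustration with variable names of our own choosing: it reweights the $n_k$ for three molecules by the factors 1, 2, 4, 8 and checks the resulting probabilities against Eq. (2.26), using exact fractions.

```python
from fractions import Fraction
from math import comb

M = 3
old_n = [comb(M, k) for k in range(M + 1)]   # 1, 3, 3, 1 as in Fig. 2.4
weight = [2 ** k for k in range(M + 1)]      # right side twice as likely per molecule
new_n = [n * w for n, w in zip(old_n, weight)]
total = sum(new_n)
new_p = [Fraction(n, total) for n in new_n]

# Eq. (2.26) evaluated for the same M
formula = [comb(M, k) * Fraction(2, 3) ** k * Fraction(1, 3) ** (M - k)
           for k in range(M + 1)]

print(new_n, total)
print(new_p == formula)
```

Fractions are printed in lowest terms, so 6/27 and 12/27 appear as 2/9 and 4/9 if the list `new_p` is displayed.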

Table 2.2

k    OLD n   WEIGHT   NEW n   NEW p
0    1       1        1       1/27
1    3       2        6       6/27
2    3       4        12      12/27
3    1       8        8       8/27
                      27      1

More generally, if we divide the
volume into two parts $V_L$ and $V_R$, we
may take the probability of finding a
single molecule on the right to be
given by $p = V_R/(V_L + V_R)$, and the
probability of finding it on the left
to be given by $q = V_L/(V_L + V_R)$. Then
the same argument used in arriving at
Eq. (2.26) leads to the following expression
for the probability of finding
k molecules out of a total of M in the
volume $V_R$:

$$p_k = \frac{M!}{k!\,(M-k)!}\, p^k q^{M-k}. \tag{2.27}$$



From the binomial expansion, Eq. (2.17),
we see that the sum of the $p_k$'s is
unity, as it should be:

$$\sum_{k=0}^{M} p_k = \sum_{k=0}^{M} \frac{M!}{k!\,(M-k)!}\, p^k q^{M-k} = (p + q)^M = 1^M = 1.$$

The case discussed in section 2.3 is
then the symmetric case p = q = 1/2.

The general asymmetric binomial 
distribution defined by Eq. (2.27) applies
whenever one makes M independent
repetitions of a choice between two mutually
exclusive alternatives whose
probabilities are p and q, with p + q
= 1. Other examples, in addition to the
one used above, can easily be con- 
structed from the kinds of physical 
problems discussed at the beginning of 
section 2.3. The biased random walk, in 
which the particle moves to the right 
with probability p, and to the left 
with probability q at each step, is 
such an instance. 

Using the same methods as in the 
previous section, we can calculate the 
mean value of k and its standard devi- 
ation; the reader should carry through 
the steps and convince himself of the 
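result:

$$\langle k \rangle = \sum_{k=0}^{M} k\, p_k = pM, \tag{2.28}$$

$$\sigma^2 = \langle k^2 \rangle - \langle k \rangle^2 = p(1-p)M. \tag{2.29}$$

Hence

$$\sigma/\langle k \rangle = \frac{\sqrt{p(1-p)M}}{pM}. \tag{2.30}$$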



Again the characteristic inverse square 
root of the number of molecules appears.
Notice that if p is much less
than unity - if, for example, we are
studying the probability of finding k
molecules in a very small volume of the
original box - then $\sigma$ is approximately
equal to $\sqrt{pM}$, or to $\sqrt{\langle k \rangle}$, and the ratio
$\sigma/\langle k \rangle$ is then $1/\sqrt{pM}$, or $1/\sqrt{\langle k \rangle}$; the
expected fluctuations from the mean 
number are of the order of the square 
root of the mean number itself. 
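
Before carrying through the algebra, the reader may like to confirm Eqs. (2.28) and (2.29) by direct summation. The Python sketch below does this under assumed values (M = 12 and p = 2/3 are chosen only for illustration, and the function names are ours), using exact fractions so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import comb

def binomial_probabilities(M, p):
    """p_k of Eq. (2.27) for k = 0 ... M, with q = 1 - p."""
    q = 1 - p
    return [comb(M, k) * p ** k * q ** (M - k) for k in range(M + 1)]

def mean_and_variance(probabilities):
    mean = sum(k * pk for k, pk in enumerate(probabilities))
    mean_sq = sum(k * k * pk for k, pk in enumerate(probabilities))
    return mean, mean_sq - mean ** 2  # Eq. (2.12)

M, p = 12, Fraction(2, 3)  # illustrative values, not from the text
probabilities = binomial_probabilities(M, p)
mean, variance = mean_and_variance(probabilities)

print(sum(probabilities), mean, variance)
print(mean == p * M, variance == p * (1 - p) * M)
```

A check of this kind is no substitute for the derivation, but it catches slips in the algebra quickly.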



2.5 THE MULTINOMIAL DISTRIBUTION 

The binomial distribution occurs 
whenever a choice of two alternatives 
is made repeatedly. But often there are
more than two alternatives! We can 
imagine dividing our box of gas into 
three, or ten, or a million parts in- 
stead of two; a random walk can take 
place on a plane or in space instead 
of on a line, with a number of differ- 
ent choices of steps possible each time 
rather than simply "right" or "left"; 
a magnetic atom may have several ori- 
entations possible relative to an ex- 
ternal magnetic field instead of simply 
"along" or "against." How are we to 
handle these cases? 

We may state the problem this way: 
M molecules are to be distributed among 
N cells of equal volume into which the 
box has been divided in imagination.
How many of the possible assignments of
individual molecules to cells correspond
to having $k_1$ molecules in cell 1,
$k_2$ in cell 2, and so on up to $k_N$ in
cell N, in such a way that $k_1 + k_2 +
\dots + k_N = M$? Let us first notice that
our previous work has shown that for
N = 2, the number in question is
$M!/(k_1!\,k_2!)$, using the new notation.
We can understand this result by 
slightly different reasoning than we 
originally used to obtain it. The number
of different assignments would be
M! if we had only one molecule in each
cell, since M! is the number of permutations
of M objects. But a permutation
of the $k_1$ molecules in the first cell
among themselves, when $k_1$ is greater
than one, does not give a distinct assignment,
so we must divide by $k_1!$; a
similar argument for the other cell requires
us to divide by $k_2!$. Now the required
generalization to N cells is
easy to see: We must divide by the factorial
of each individual cell's population.
Hence the result for the required
number of assignments, which we
shall call $n(k_1, k_2, \dots, k_N)$, is:

$$n(k_1, k_2, \dots, k_N) = \frac{M!}{k_1!\,k_2! \cdots k_N!}. \tag{2.31}$$



The total number of assignments is $N^M$,
since there are N possibilities for
each of the M molecules; hence the
probability of the given assignment is

$$p(k_1, k_2, \dots, k_N) = \frac{1}{N^M}\,\frac{M!}{k_1!\,k_2! \cdots k_N!}. \tag{2.32}$$



This distribution, which is a joint
distribution of the kind discussed in
section 1.4 in N variables, is known
as the multinomial distribution. As
with the binomial distribution, this
name originates from the fact that the
$n(k_1, k_2, \dots, k_N)$ are the coefficients
in a certain expansion:

$$(a_1 + a_2 + \dots + a_N)^M = \sum_{\{k_i\}} n(k_1, k_2, \dots, k_N)\, (a_1)^{k_1} (a_2)^{k_2} \cdots (a_N)^{k_N}, \tag{2.33}$$



where the sum is over all sets of nonnegative
integers $k_1, k_2, \dots, k_N$ satisfying
the condition $k_1 + k_2 + \dots + k_N
= M$. This is the general multinomial
expansion; that the coefficients are
indeed given by Eq. (2.31) follows on
observing that a given coefficient
counts the number of distinct ways of
picking out $k_1$ factors $a_1$, $k_2$ factors
$a_2$, ... $k_N$ factors $a_N$ from M factors
$(a_1 + a_2 + \dots + a_N)$. This is exactly the
same counting problem as that of picking
out $k_1$ molecules to put in the
first cell, etc., out of a total of M
molecules. We can use this to confirm
that the total number of assignments
is $N^M$; indeed, setting each of the $a_i$
in Eq. (2.33) equal to unity, we get:

$$(\underbrace{1 + 1 + \dots + 1}_{N\ \text{terms}})^M = \sum_{\{k_i\}} n(k_1, k_2, \dots, k_N)\,(1)^{k_1} (1)^{k_2} \cdots (1)^{k_N},$$

or

$$N^M = \sum_{\{k_i\}} n(k_1, k_2, \dots, k_N); \tag{2.34}$$

hence the probabilities add correctly
to unity:

$$\sum_{\{k_i\}} p(k_1, k_2, \dots, k_N) = \sum_{\{k_i\}} \frac{n(k_1, k_2, \dots, k_N)}{N^M} = \frac{1}{N^M}\, N^M = 1. \tag{2.35}$$

Table 2.3

(k_1,k_2,k_3)   n    PERMUTATIONS   PRODUCT
(5,0,0)         1    3              3
(4,1,0)         5    6              30
(3,2,0)         10   6              60
(3,1,1)         20   3              60
(2,2,1)         30   3              90



This distribution is harder to visualize
than the binomial. For N = 3, however,
where $k_1$ and $k_2$ are essentially
the only variables involved (since $k_3$
is necessarily equal to $M - k_1 - k_2$),
we can construct a three-dimensional
graph analogous to Fig. 1.4 of Chapter 1.
We have done this in Fig. 2.6,
which exhibits the
frequencies $n(k_1, k_2, k_3)$ for the case
M = 5. Table 2.3
gives the values of n for the possible
assignments.

From Fig. 2.6, one can see a ten- 
dency, even though the number of mole- 
cules is small, for the assignments 
corresponding to approximately equal 
distribution of the molecules among the 
cells to predominate. It is not hard to 
demonstrate that the uniform (i.e., 
equal population of cells) distribution 
is indeed the most probable one, at 
least in the case where the number M 
of molecules is an integer multiple of 
the number N of cells. For if M = rN,
where r is an integer, the uniform distribution
is the one for which $k_1 = k_2
= \dots = k_N = r$; the corresponding n is
$M!/(r!)^N$. Suppose we move one molecule
out of one cell and into another;
the corresponding n is now
$M!/[(r-1)!\,(r+1)!\,(r!)^{N-2}]$. The ratio of the
second to the first is
$(r!)^2/[(r-1)!\,(r+1)!]$, which equals r/(r + 1);
hence the change has decreased the num- 
ber of assignments, and thus also the 
probability of occurrence. 
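
The multinomial counts for a small case can be generated by brute force. The Python sketch below is an illustration (function and variable names are ours): it evaluates Eq. (2.31) for every way of distributing M = 5 molecules over N = 3 cells, groups the results by occupation pattern as in Table 2.3, and then checks the ratio r/(r + 1) for the assumed case r = 2, N = 3.

```python
from collections import Counter
from itertools import product
from math import factorial

def multinomial(ks):
    """n(k_1, ..., k_N) = M!/(k_1! ... k_N!) of Eq. (2.31), with M = sum(ks)."""
    n = factorial(sum(ks))
    for k in ks:
        n //= factorial(k)  # each partial quotient is itself an integer
    return n

M, N = 5, 3
cells = [ks for ks in product(range(M + 1), repeat=N) if sum(ks) == M]
total = sum(multinomial(ks) for ks in cells)  # should equal N**M, Eq. (2.34)

# Group by pattern: (5,0,0), (0,5,0), (0,0,5) are permutations of one another.
patterns = Counter(tuple(sorted(ks, reverse=True)) for ks in cells)
table = {pat: (multinomial(pat), count, multinomial(pat) * count)
         for pat, count in patterns.items()}
for pat in sorted(table, reverse=True):
    print(pat, *table[pat])
print(total, N ** M)

# Moving one molecule away from the uniform case (r = 2, N = 3, M = 6):
uniform, shifted = multinomial((2, 2, 2)), multinomial((1, 3, 2))
print(shifted, uniform)  # their ratio should be r/(r + 1) = 2/3
```

The products in the last column of the printed table add up to $3^5$, as Eq. (2.34) requires.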

We can calculate the mean value of 
the $k_i$'s, and also other averages of
interest; but for this purpose we must
first note the generalization of the
definition of an average which is required
for the case of a joint distribution.
The average of any function of
the $k_i$'s is defined in analogy to Eq.
(2.7) of section 2.1:

$$\begin{aligned}
\langle f(k_1, k_2, \dots, k_N) \rangle &= \frac{1}{N^M} \sum_{\{k_i\}} f(k_1, k_2, \dots, k_N)\, n(k_1, k_2, \dots, k_N) \\
&= \sum_{\{k_i\}} f(k_1, k_2, \dots, k_N)\, p(k_1, k_2, \dots, k_N),
\end{aligned} \tag{2.36}$$



where the sum is over all values of
each $k_i$ from 0 to M, such that $k_1 + k_2
+ \dots + k_N = M$. In particular, the mean
value of any of the $k_i$ can be calculated
for the multinomial distribution
given by Eq. (2.31) or (2.32):

$$\langle k_1 \rangle = \frac{1}{N^M} \sum_{\{k_i\}} k_1\, \frac{M!}{k_1!\,k_2! \cdots k_N!} = \frac{M}{N^M} \sum_{\{k_i\}} \frac{(M-1)!}{(k_1 - 1)!\,k_2! \cdots k_N!},$$

and using Eq. (2.34) with (M - 1) in
place of M, we obtain:

$$\langle k_1 \rangle = \frac{M}{N^M} \cdot N^{M-1} = \frac{M}{N}. \tag{2.37}$$



As we expect, the mean value $\langle k_1 \rangle$ is
simply the number of molecules per
cell; an identical calculation holds
for each of the k's, from the symmetry
of the multinomial distribution in the
k's. Hence we have for each i:

$$\langle k_i \rangle = M/N. \tag{2.38}$$

Similarly, we can find the standard
deviation for each $k_i$; the calculation
is like that for the binomial distribution,
and yields, for each i:

$$\sigma_i^2 = \langle k_i^2 \rangle - \langle k_i \rangle^2 = (N - 1)M/N^2. \tag{2.39}$$



Notice that this agrees with the result
(Eq. (2.24)) for the case N = 2.
If on the other hand we allow N to be
much greater than unity, this result
becomes approximately:

$$\sigma_i^2 \approx M/N = \langle k_i \rangle, \tag{2.40}$$

or $\sigma_i \approx \sqrt{M/N} = \sqrt{\langle k_i \rangle}$, and

$$\sigma_i/\langle k_i \rangle \approx 1/\sqrt{M/N} = 1/\sqrt{\langle k_i \rangle}. \tag{2.41}$$

Thus the relative fluctuations in num- 
ber in a given cell are again inversely 
proportional to the square root of the 
mean number in that cell. With one cubic
centimeter of gas containing $10^{20}$
molecules, for example, the number of
molecules in each cubic micron of the
volume (one micron = $10^{-4}$ cm) is $10^8$ on
the average; this would exhibit fluctuations
of the order of $10^4$ molecules,
which is only 0.01% of the average
number.

Again, here, as with the symmet- 
ric binomial distribution in section 
(2.3), we have implicitly assumed, in 
giving equal weights to all the indi- 
vidual assignments of molecules to 
cells, that a single molecule is
"equally likely" to be found in each
cell. This is surely plausible when the
volumes of the cells are equal; but
what if they are not? Or, in the ran- 
dom walk problem, what if steps in dif- 
ferent directions have different prob- 
abilities? The generalization required 
to handle these problems follows along 
the same lines as given in section 2.4. 
Suppose the volume of the ith cell is
$V_i$ and the total volume is V; then the
probability $p_i$ of finding a single molecule
in the ith cell is $p_i = V_i/V$; the
sum of the $p_i$'s is 1, since the sum of
the $V_i$'s is V. The probability of finding
$k_1$ specified molecules in $V_1$, $k_2$
in $V_2$, and so on, is $(p_1)^{k_1} (p_2)^{k_2} \cdots
(p_N)^{k_N}$; this must be multiplied by the
number of assignments, from Eq. (2.31),
that give the required distribution
among the cells, to yield the result:

$$p(k_1, k_2, \dots, k_N) = \frac{M!}{k_1!\,k_2! \cdots k_N!}\,(p_1)^{k_1} (p_2)^{k_2} \cdots (p_N)^{k_N}. \tag{2.42}$$



The multinomial theorem, Eq. (2.33), assures
us that

$$\sum_{\{k_i\}} p(k_1, k_2, \dots, k_N) = 1,$$

and the previous case, given by Eq.
(2.32), is recovered when all the cell
volumes and hence all the probabilities
are equal: $p_1 = p_2 = \dots = p_N = 1/N$.

The discussion of the mean values 
and the standard deviations of the $k_i$'s
follows very much as before; carrying
out the details is left as one of the
problems at the end of the chapter. The
results are:

$$\langle k_i \rangle = p_i M, \tag{2.43}$$

$$\sigma_i^2 = p_i (1 - p_i) M, \tag{2.44}$$

$$\sigma_i/\langle k_i \rangle = \sqrt{(1 - p_i)/(p_i M)}. \tag{2.45}$$

These results clearly reduce to the
previous ones (Eqs. (2.38) and (2.39)),
when $p_i = 1/N$. We see that the mean
fraction of the molecules in the ith
cell is just $p_i$, which is simply the
fraction of the total volume in the
ith cell, and that again the fractional
deviation from the mean is inversely
proportional to the square root of the
mean number for each cell.
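
Equations (2.43) and (2.44) can be checked by summing over every possible set of occupation numbers, in the manner of Eq. (2.36). The Python sketch below is only an illustration: the cell probabilities 1/2, 1/3, 1/6 and the choice M = 4 are assumptions made for the example, and the names are ours. Exact fractions are used throughout.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_probability(ks, ps):
    """Eq. (2.42): M!/(k_1! ... k_N!) times p_1^k_1 ... p_N^k_N."""
    coefficient = factorial(sum(ks))
    for k in ks:
        coefficient //= factorial(k)
    probability = Fraction(coefficient)
    for k, p in zip(ks, ps):
        probability *= p ** k
    return probability

M = 4
ps = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # assumed cell probabilities

outcomes = [ks for ks in product(range(M + 1), repeat=len(ps)) if sum(ks) == M]
probs = {ks: multinomial_probability(ks, ps) for ks in outcomes}

total = sum(probs.values())
means = [sum(ks[i] * pr for ks, pr in probs.items()) for i in range(len(ps))]
variances = [sum(ks[i] ** 2 * pr for ks, pr in probs.items()) - means[i] ** 2
             for i in range(len(ps))]

print(total, means, variances)
```

For each cell the printed mean equals $p_i M$ and the printed variance equals $p_i(1 - p_i)M$.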

Let us summarize in more general 
language the essential result of this 
section. Suppose N alternative outcomes 
of an event are possible, and they have 
probabilities $p_1, p_2, \dots, p_N$. Then if a
sequence of M independent trials is
carried out, the probability that outcome
1 occurs $k_1$ times, 2 occurs $k_2$
times, and so on, is given by the multinomial
distribution, Eq. (2.42). In
the limit as M becomes larger and larger,
the fraction $k_i/M$ is increasingly
likely to be found very near to $p_i$;
that is, its mean is $p_i$, and its standard
deviation decreases as M increases.
Indeed, it is precisely this behavior
of repeated trials which allows us to
identify the probability $p_i$ with the
frequency of occurrence of the ith al- 
ternative in repeated trials as dis- 
cussed in section 1.6, and which there- 
fore renders consistent our very use of 
the term.

PROBLEMS

2.1 Work out the distribution for the
total showing face up when a pair
of dice are thrown, assuming both
the dice and the throw to be unbiased
(i.e., each of the six faces
equally likely to turn up). What
is the most probable result? The
mean result? The standard deviation?

2.2 Use the binomial distribution to
discuss the random walk on a line.
Starting at x = 0, a particle moves
unit distance either to the right
or the left with equal probability.
(Imagine tossing a coin each move
to decide which way - heads, go
right, tails, go left.) Out of a
total number N of moves, it takes
L to the left and R to the right;
L + R = N. The net distance traveled
to the right is evidently D =
R - L. What is the distribution of
D? Find the most probable, and the
mean values of D. What is the root
mean square of D?

2.3 Coin-tossing sequences of heads and 
tails may be discussed with the 
help of the binomial distribution. 
What odds should you be willing to 
offer against tossing exactly 6 
heads in 20 tosses? 

2.4 Toss a coin 60 times, and keep a 
record of the results. 

(a) Does the total number of heads 
in the 60 throws lie within the 
theoretical standard deviation of 
the expected mean number? 

(b) Divide the results into 20 se- 
quences of 3 tosses, and find the 
distribution among the four alternatives
(0, 1, 2, or 3 heads). Does it
lie "reasonably" close to the the- 
oretical binomial distribution? 



2.5 For the case of three molecules in
a box, work out the results of Table
2.2 in the following way: Imagine
first that the box is divided
into three equal parts, and list
explicitly (as in Fig. 2.4) all of
the possible ways of assigning the
three molecules to the three parts.
Then imagine one of the partitions
eliminated, so that there are only
two parts left, one double the size
of the other. Count the number of
assignments corresponding to 0, 1,
2, 3, molecules, respectively, in
the larger part.

2.6 Consider a random walk in the plane
along a square network with equal
possibilities of moving right,
left, up, or down at each step. Let
$M$ steps be taken, of which $k_1$, $k_2$,
$k_3$, and $k_4$ are respectively to the
right, left, up, and down.

(a) Express the distance $D$ from the
starting point in terms of the $k_i$'s.

(b) Find the mean square distance
moved in $M$ steps, $\langle D^2 \rangle$.

(Hint: The $k_i$ are distributed
according to a multinomial
distribution with $N = 4$. You will need to
work out averages of the form
$\langle k_1 k_2 \rangle$ as well as those done in the
text.)

2.7 Given a joint distribution of two
or more quantities $k_1$, $k_2$, ...,
the correlation coefficient $r_{ij}$ is
defined by the relation

$$r_{ij} = \frac{\langle k_i k_j \rangle - \langle k_i \rangle \langle k_j \rangle}{\sigma_i \sigma_j}.$$

Work out the value of $r_{ij}$ for the
multinomial distribution. To what
does it reduce for the case $N = 2$?

3.1 INTRODUCTION: MEAN VALUES;
EXAMPLES

The probability distributions
dealt with in Chapter 2 were discrete,
that is, the possible alternatives
could be characterized by an integer $k$
(or a set of integers $k_i$), which ranged
over a finite number of values. Very
often we need to deal with what we will 
call continuous distributions, where 
the possible alternatives are charac- 
terized by one or more variables which 
range over a continuum like that of the 
real numbers. Some examples of continu- 
ously distributed quantities which 
arise in physics, together with the 
range of possible values of the quan- 
tity in question, are: 

(1) The distance between successive
collisions of a molecule in a gas,
the so-called "free path"; any
positive value.

(2) A velocity component of a
molecule in a gas; any positive or
negative value.

(3) The angle through which a
nuclear particle is scattered in a
collision; any value between 0 and $\pi$.

(4) The time of decay of a
radioactive nucleus; any positive value.

Of course, in any of these examples,
we may (as in the example of the
velocities in Chapter 1) divide the
range into a finite number of
intervals, and treat the distribution as
discrete. Indeed, the limitations of
accuracy of the measuring instrument
and the finiteness of the population
may require us to do so. Nevertheless,
we may imagine both the accuracy of
measurement and the size of the
population increased sufficiently to allow
the intervals to be decreased
indefinitely; in the limit we can speak about
the probability corresponding to an
arbitrarily small interval.

It is simplest to begin with the notion of the cumulative distribution
function, which was discussed in section 1.3. Let us denote by $P(x)$ the
probability that the quantity whose distribution is under study is less
than $x$. Then $P(x)$ can never decrease as $x$ increases; for if $x'$ is
greater than $x$, $P(x')$ must be equal to $P(x)$, the probability that the
quantity is less than $x$, plus the probability that it lies between $x$
and $x'$; since probabilities cannot be negative, $P(x')$ cannot be less
than $P(x)$. The general appearance of possible $P$'s is illustrated in
Figs. 3.1 and 3.2. The first shows a distribution which has a minimum and
a maximum possible value for $x$, while the second shows a distribution
which extends indefinitely to large negative and positive values.
Ultimately, $P(x)$ must approach 0 at the left-hand end and 1 at the
right-hand end of the graph. The probability of finding a value of $x$
lying between two specified values $x_1$ and $x_2$ is then given (provided
$x_1$ is less than $x_2$) by $[P(x_2) - P(x_1)]$. If $P(x)$ is a continuous
function, then $[P(x_2) - P(x_1)]$ will approach zero as $x_2$ approaches
$x_1$; and if, furthermore, $P(x)$ is differentiable, then for
$(x_2 - x_1)$ sufficiently small, we may write by Taylor's expansion:

$$P(x_2) = P(x_1) + (x_2 - x_1)\,(dP/dx)_{x = x_1},$$

neglecting higher terms of the expansion. Hence

$$P(x_2) - P(x_1) = (x_2 - x_1)\,p(x_1), \qquad (3.1)$$

where

$$p(x) = dP/dx. \qquad (3.2)$$

Thus the probability that $x$ lies in a small interval around $x_1$ is
given by the product of the interval with $p(x_1)$; hence the name
probability density function which is sometimes used for the derivative
$p(x)$. It can also be called the probability "per unit interval of $x$."
Dimensionally it has the units of the reciprocal of $x$, since
multiplication by an interval of $x$ gives a pure number, namely a
probability. Note particularly that it is not the probability of "finding
the value $x$"; since we are dealing with a continuum, such a probability
must be zero! Figures 3.3 and 3.4 illustrate the probability density
functions corresponding to the cumulative probability functions of
Figs. 3.1 and 3.2, respectively.

The cross-hatched area in Fig. 3.4 is given by the integral of $p(x)$ from
$x_1$ to $x_2$. Using Eq. (3.2) and the fundamental theorem of integral
calculus, we have:

$$\int_{x_1}^{x_2} p(x)\,dx = \int_{x_1}^{x_2} (dP/dx)\,dx = P(x)\Big|_{x_1}^{x_2} = P(x_2) - P(x_1). \qquad (3.3)$$

Hence the area under the probability density curve from $x_1$ to $x_2$
gives the probability of finding a value of $x$ lying between $x_1$ and
$x_2$. In particular, if we let $x_1$ go to $-\infty$, we have, since
$P(-\infty) = 0$:

$$\int_{-\infty}^{x_2} p(x)\,dx = P(x_2), \qquad (3.4)$$

and if we let $x_2$ go to $+\infty$, we have, since $P(+\infty) = 1$:

$$\int_{-\infty}^{\infty} p(x)\,dx = 1. \qquad (3.5)$$

This equation is completely analogous to Eq. (2.3) of the discrete case,
and is sometimes referred to as the "normalization" condition on the
probability density function.

We may define mean values in a fashion analogous to that used in
Chapter 2, but replacing sums by integrals; thus the mean value of $x$ is:

$$\langle x \rangle = \int_{-\infty}^{\infty} x\,p(x)\,dx, \qquad (3.6)$$

and the standard deviation is given by

$$\sigma^2 = \langle (x - \langle x \rangle)^2 \rangle = \int_{-\infty}^{\infty} (x - \langle x \rangle)^2\,p(x)\,dx = \langle x^2 \rangle - \langle x \rangle^2, \qquad (3.7)$$

and in general, the mean value of any function of $x$ by

$$\langle f(x) \rangle = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx. \qquad (3.8)$$

These formulae may be thought of as arising from the corresponding ones of
the discrete case (Eqs. (2.5), (2.7), and (2.11)) by first breaking up the
range of $x$ into $N$ intervals, and associating the probability
$p(x_k)\,\Delta x_k$ with the kth interval $\Delta x_k$. Applying the
discrete formulae, and then passing to the limit $\Delta x_k \to 0$ and
$N \to \infty$, gives the integral expressions.

Let us illustrate these notions
with a few examples.



3.1.1 Uniform Distribution Between
$x = 0$ and $x = L$

Suppose a particle is "equally likely" to be found anywhere on the x-axis
between 0 and $L$. We take this to mean that the probability density
$p(x)$ is constant between 0 and $L$, and zero elsewhere:

$$p(x) = \begin{cases} 0, & (x < 0) \\ C, & (0 < x < L) \\ 0, & (x > L) \end{cases} \qquad (3.9)$$

Since condition (3.5) must be satisfied, the constant $C$ must be equal to
$1/L$. The cumulative distribution $P(x)$ is then given by application of
Eq. (3.4):

$$P(x) = \int_{-\infty}^{x} p(x')\,dx' = \begin{cases} 0, & (x < 0) \\ x/L, & (0 \le x \le L) \\ 1, & (x > L) \end{cases} \qquad (3.10)$$

Figures 3.5 and 3.6 show graphs of $p(x)$ and $P(x)$, respectively. A
simple calculation, using Eqs. (3.6) and (3.7), shows that
$\langle x \rangle = L/2$, as one would expect, and that
$\sigma = L/\sqrt{12} \approx 0.289\,L$.
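
As a rough numerical check of these values, the integrals in Eqs. (3.5) to (3.7) can be approximated by sums over many small intervals, in the spirit of the limiting process described above. The sketch below is illustrative only: the function name `moments_uniform`, the length L = 2, and the number of intervals are assumptions made for the example.

```python
import math

def moments_uniform(L, n=100_000):
    """Midpoint-rule estimates of normalization, <x>, and sigma for p(x) = 1/L on (0, L)."""
    dx = L / n
    p = 1.0 / L                       # constant density, Eq. (3.9) with C = 1/L
    norm = mean = mean_sq = 0.0
    for k in range(n):
        x = (k + 0.5) * dx            # midpoint of the k-th interval
        norm += p * dx                # sum approximating Eq. (3.5)
        mean += x * p * dx            # sum approximating Eq. (3.6)
        mean_sq += x * x * p * dx     # <x^2>, needed for Eq. (3.7)
    sigma = math.sqrt(mean_sq - mean ** 2)
    return norm, mean, sigma

L = 2.0                               # illustrative length (an assumption)
norm, mean, sigma = moments_uniform(L)
print("normalization:", norm)
print("mean:", mean, " expected L/2 =", L / 2)
print("sigma:", sigma, " expected L/sqrt(12) =", L / math.sqrt(12))
```

The same loop works for any density that vanishes outside a finite range; only the line defining `p` would change.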



3.1.2 Uniform Distribution in Angle 

Imagine a particle scattered so that its final direction of travel makes
an angle $\theta$ with its initial direction. We say it has been scattered
through an angle $\theta$. If we are considering scattering only in a
plane, $\theta$ may range from $-\pi$ to $+\pi$, and a uniform
distribution would correspond to a density function
$p(\theta) = 1/(2\pi)$, in analogy with the result in 3.1.1. Suppose,
however, that we are interested in scattering in space, and that by
uniformity of distribution we mean that all directions in space, relative
to the original direction, are "equally likely." A reasonable
interpretation of what this means is the following: Consider a sphere;
each point of the sphere determines by its radius vector from the center a
direction of scattering, relative to a fixed direction determined by a
fixed point on the sphere. Then we shall mean by "all directions equally
likely" that equal areas on the sphere are equally likely; hence the
probability of scattering through an angle between $\theta$ and
$(\theta + d\theta)$, relative to a given direction, is to be equated to
the ratio of the area of the sphere corresponding to such directions, to
the whole area of the sphere. Referring to Fig. 3.7, we see that angles of
scattering between $\theta$ and $(\theta + d\theta)$ correspond to a ring
of radius $R \sin\theta$ and width $R\,d\theta$ on the sphere; the area of
the ring is therefore $2\pi R^2 \sin\theta\,d\theta$, and since the
surface area of the whole sphere is $4\pi R^2$, we have:

$$p(\theta)\,d\theta = \frac{2\pi R^2 \sin\theta\,d\theta}{4\pi R^2} = \tfrac{1}{2} \sin\theta\,d\theta, \quad (0 \le \theta \le \pi). \qquad (3.11)$$

Thus the probability density vanishes at both 0 and $\pi$, and has a
maximum at $\pi/2$; that is, it vanishes for directly forward or directly
backward scattering, and has a maximum at right angles to the incident
direction. There are more ways, so to speak, in which scatterings at right
angles can occur, than forward and backward scatterings.
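
The "equal areas are equally likely" interpretation can be tried out numerically. The sketch below draws directions uniformly over the sphere and compares the fraction of polar angles falling in each of six equal angular bins with the area under $\tfrac{1}{2}\sin\theta$ from Eq. (3.11). The sampling method (normalizing three independent Gaussian numbers), the seed, the sample size, and the bin count are all assumptions chosen for illustration.

```python
import math
import random

def sample_polar_angle(rng):
    """Polar angle of a direction drawn uniformly over the surface of a sphere."""
    # A normalized triple of independent Gaussian numbers has a uniformly
    # distributed direction; the polar angle is measured from the z-axis.
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        r = math.sqrt(x * x + y * y + z * z)
        if r > 0:
            return math.acos(max(-1.0, min(1.0, z / r)))

rng = random.Random(1)       # fixed seed so the run is repeatable
n = 200_000                  # number of sampled directions
bins = 6                     # equal bins in theta between 0 and pi
counts = [0] * bins
for _ in range(n):
    theta = sample_polar_angle(rng)
    counts[min(int(theta / math.pi * bins), bins - 1)] += 1

observed = [c / n for c in counts]
# Area under (1/2) sin(theta) over each bin, from Eq. (3.11)
expected = [
    (math.cos(k * math.pi / bins) - math.cos((k + 1) * math.pi / bins)) / 2
    for k in range(bins)
]
for k in range(bins):
    print(f"bin {k}: observed {observed[k]:.4f}, expected {expected[k]:.4f}")
```

The bins near 0 and $\pi$ collect the fewest directions and those near $\pi/2$ the most, as the density in Eq. (3.11) requires.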

3.1.3 Gaussian Distribution 

This distribution occurs frequently in a variety of applications. We shall
see it arising as a limiting case of the binomial distribution, and in the
discussion of the distribution of errors of observation. It is also called
the "normal distribution" by statisticians. It is defined by:

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-(x - x_0)^2 / 2\sigma^2\right], \qquad (3.12)$$

$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{x} \exp\left[-(x' - x_0)^2 / 2\sigma^2\right] dx'. \qquad (3.13)$$

The mean value of the distribution is $x_0$ and the standard deviation is
$\sigma$. Figures 3.8 and 3.9, respectively, give graphs of $p(x)$ and
$P(x)$; in each one, two cases are included, corresponding to "small" and
"large" values of $\sigma$. The integral expression for $P(x)$ in
Eq. (3.13) cannot be expressed in terms of "elementary" functions, but
itself defines a new function, values of which can be found in
mathematical or statistical tables.
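
In place of printed tables, the tabulated function is available in most programming languages as the error function. The sketch below writes $P(x)$ of Eq. (3.13) in terms of Python's `math.erf` and compares it with a direct numerical integration of the density in Eq. (3.12). The function names, the lower cutoff of ten standard deviations, and the test values are assumptions made for the example.

```python
import math

def gauss_density(x, x0, sigma):
    """Probability density of Eq. (3.12)."""
    return math.exp(-(x - x0) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def gauss_cumulative(x, x0, sigma):
    """Cumulative distribution of Eq. (3.13), expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - x0) / (sigma * math.sqrt(2))))

def gauss_cumulative_numeric(x, x0, sigma, n=20_000):
    """Midpoint-rule integral of the density from x0 - 10*sigma up to x."""
    lo = x0 - 10 * sigma              # the area below this point is negligible
    dx = (x - lo) / n
    return sum(gauss_density(lo + (k + 0.5) * dx, x0, sigma) for k in range(n)) * dx

x0, sigma = 0.0, 1.0                  # illustrative parameters
print(gauss_cumulative(1.0, x0, sigma), gauss_cumulative_numeric(1.0, x0, sigma))
# Probability of lying within one standard deviation of the mean
print(gauss_cumulative(x0 + sigma, x0, sigma) - gauss_cumulative(x0 - sigma, x0, sigma))
```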



3.2 CHANGE OF VARIABLE 

A frequently arising problem in 
dealing with distributions is that of 
finding the distribution of a quantity 
which is a known function of another 
quantity, whose distribution is known. 
Consider, for example, the angular dis- 
tribution of particles scattered in 
collisions. What determines its form? 

If the scattering can be treated by 
classical dynamics, the angle of scat- 
tering in any given collision is de- 
termined by the "impact parameter" of 
the collision, which is the perpendic- 
ular distance from the line along which 
the incident particle travels, to a 
parallel line through the scattering 
center. That is, there is a unique functional relation, whose form depends
on the nature of the interaction between the incident particle and the
scattering center, between the impact parameter $b$ and the scattering
angle $\theta$. Figure 3.10 illustrates a possible pair of trajectories
for a scattering event, with the impact parameter and scattering angle
indicated for each. To determine the distribution of scattering angles,
therefore, one must know the distribution of impact parameters, and
translate through the functional relation between the two.

To state the problem in general terms, suppose we are given the
probability density function $p(y)$ for a quantity $y$, and we are also
given that $y$ is related to another quantity $x$ through a functional
relation $y = f(x)$. What is the probability density function for $x$? We
will assume for simplicity initially that $f(x)$ is monotone; that is,
that it is a steadily increasing function. The reason for this is to
assure that each value of $x$ corresponds to only one value of $y$ and
vice versa. Figure 3.11 illustrates the relation between $y$ and $x$.
Referring to the figure, we see that the values of $y$ lying between $y_0$
and $(y_0 + \Delta y)$ correspond precisely to the values of $x$ lying
between $x_0$ and $(x_0 + \Delta x)$ and to no others, where
$y_0 = f(x_0)$, and $\Delta y$ is related to $\Delta x$ through:

$$\Delta y = f'(x_0)\,\Delta x, \qquad f'(x) = df/dx. \qquad (3.14)$$

The probability that $y$ lies between $y_0$ and $(y_0 + \Delta y)$ is
$p(y_0)\,\Delta y$; this is identical with the probability that $x$ lies
in its corresponding interval. Hence we have:

$$\text{probability that } x \text{ is in the interval from } x_0 \text{ to } (x_0 + \Delta x) = p(y_0)\,\Delta y = p[f(x_0)]\,f'(x_0)\,\Delta x. \qquad (3.15)$$

This is in the form of a function of $x_0$ multiplied by the interval
$\Delta x$; hence from the definition of the probability density function
for $x$, the function of $x_0$ multiplying $\Delta x$ must be that
density. Denoting it by $q(x)$, we have:

$$q(x) = p[f(x)]\,f'(x). \qquad (3.16)$$

We may say: substitute $f(x)$ for $y$ both in the density and in the
differential of $y$; the coefficient of the differential of $x$ is the
density function for $x$. Suppose now that $f(x)$ had been a decreasing
function instead of an increasing one. The slope $f'(x)$ would have been
negative, and if we used Eq. (3.16) unchanged, we would have a negative
probability, which is not allowed. The difficulty can be traced to
Eq. (3.14), which gives the relation between the intervals $\Delta x$ and
$\Delta y$. We really only want the relation between their magnitudes; for
our purposes it is irrelevant whether the slope is positive or negative.
We can take this into account by using the absolute magnitude of the slope
in Eq. (3.14); we then have:

$$\Delta y = |f'(x)|\,\Delta x, \qquad (3.17)$$

and Eq. (3.16) would become instead:

$$q(x) = p[f(x)]\,|f'(x)|. \qquad (3.18)$$

If now we consider a case in which $y$ is not a single-valued function of
$x$, that is, more than one value of $y$ corresponds to certain values of
$x$, the probability density for $x$ will have contributions from each $y$
corresponding to that $x$. Figure 3.12 shows a possible situation of this
kind. From the figure one sees that for $x$ between $x_1$ and $x_2$, there
is one contribution to $q(x)$, whereas between $x_2$ and $x_3$ there are
two contributions. If we distinguish between upper and lower branches of
$f(x)$ by calling them $f_1(x)$ and $f_2(x)$, as indicated in Fig. 3.12,
then the probability density function for $x$ is given by:



$$q(x) = \begin{cases} p[f_1(x)]\,|f_1'(x)|, & (x_1 < x < x_2) \\ p[f_1(x)]\,|f_1'(x)| + p[f_2(x)]\,|f_2'(x)|, & (x_2 < x < x_3). \end{cases} \qquad (3.19)$$

More complicated cases can be treated similarly.
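
A quick way to see the absolute-value rule of Eq. (3.18) at work is to pick a simple decreasing relation and test it by sampling. In the sketch below, $y$ is taken to be uniformly distributed between 0 and 1, so that $p(y) = 1$, and the relation is taken to be $y = f(x) = e^{-x}$; Eq. (3.18) then predicts $q(x) = e^{-x}$ for positive $x$. Both choices, together with the interval (0.5, 1.5), the seed, and the sample size, are assumptions made purely for illustration.

```python
import math
import random

def q(x):
    """Density of x from Eq. (3.18), with p(y) = 1 on (0, 1) and f(x) = exp(-x)."""
    f_prime = -math.exp(-x)           # the slope is negative: f is decreasing
    return 1.0 * abs(f_prime)         # p[f(x)] = 1, multiplied by |f'(x)|

rng = random.Random(2)                # fixed seed so the run is repeatable
n = 100_000
a, b = 0.5, 1.5                       # interval of x to examine
hits = 0
for _ in range(n):
    y = 1.0 - rng.random()            # uniform on (0, 1], avoids log(0)
    x = -math.log(y)                  # invert y = exp(-x)
    if a < x < b:
        hits += 1
observed = hits / n

# Area under q(x) between a and b, by the midpoint rule
m = 10_000
dx = (b - a) / m
predicted = sum(q(a + (k + 0.5) * dx) for k in range(m)) * dx

print("sampled fraction:", observed)
print("area under q(x): ", predicted)
```

Dropping the absolute value in `q` would make the predicted area negative, which is the difficulty described in the text.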

Let us give an instance of each 
of these possibilities. The scattering 
problem can be illustrated by a very 
simple model: Imagine particles scat- 
tered by a smooth sphere. The relation 
between impact parameter and scatter- 
ing angle is easily found, using the 
geometry of the problem. Figure 3.13 
shows a typical trajectory; the particle is incident at impact parameter
$b$, bounces off the smooth sphere of radius $R$ with angle of
"reflection" equal to angle of incidence, and goes off at angle $\theta$
with its original direction. From the figure, we see that the following
relations hold:

$$\theta + 2\psi = \pi, \qquad \text{and} \qquad \sin\psi = b/R.$$

Hence we have for the relation between $b$ and $\theta$:

$$b = R \sin\psi = R \sin\left[\tfrac{1}{2}(\pi - \theta)\right] = R \cos(\theta/2). \qquad (3.20)$$

We are now ready to translate the $b$ distribution to a $\theta$
distribution. But what is a reasonable assumption about the distribution
of impact parameters? If we are thinking of the problem as a model for an
atomic or nuclear scattering experiment, we are not able to control the
$b$ values for individual collisions; we simply have a beam of particles
of a certain mean intensity. This means that, in a plane perpendicular to
the beam, equal areas have, on the average, equal numbers of particles
incident on them. That is, the probability density is uniform across the
area of the beam. Relative to a given target particle (the sphere of our
model), this means that the probability of the impact parameter lying
between $b$ and $(b + db)$ is given by the ratio of the area perpendicular
to the beam corresponding to this range of $b$ values, to the total area
of the beam intercepted by the sphere. (In this model, particles incident
at values of $b$ greater than $R$ are not scattered at all, and will not
enter into the discussion.) Since the area of a ring of radius $b$ and
width $db$ is $2\pi b\,db$, and the area of a circle of radius $R$ is
$\pi R^2$, we have:

$$p(b)\,db = \frac{2\pi b\,db}{\pi R^2} = \frac{2b\,db}{R^2}, \quad (0 \le b \le R). \qquad (3.21)$$

We can now apply Eq. (3.16), in the absolute-value form (3.18) since $b$
decreases as $\theta$ increases, to find the distribution of $\theta$:

$$q(\theta) = \frac{2R \cos(\theta/2)}{R^2} \left| \frac{d}{d\theta}\left[R \cos(\theta/2)\right] \right| = 2 \cos(\theta/2) \cdot \tfrac{1}{2} \sin(\theta/2) = \tfrac{1}{2} \sin\theta, \quad (0 \le \theta \le \pi). \qquad (3.22)$$

Remembering the discussion of the uni- 
form angular distribution in section 
3.1.2, we see that the distribution 
above is identical with that found in 
Eq. (3.11) of that section; the angu- 
lar distribution of the particles scat- 
tered from a smooth sphere is uniform. 
This simple result is, so to speak, an 
accident arising from the particular 
form of interaction between the inci- 
dent particle and the scattering cen- 
ter. Other laws of interaction will 
give different angular distributions; 
in particular, the Coulomb interaction 
which enters when the particles both 
carry electrostatic charges is an im- 
portant case, but will be left to the 
Problems for discussion. 
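
The smooth-sphere result can also be checked by simulating the beam directly. The sketch below draws impact parameters with the density of Eq. (3.21), converts each to a scattering angle by inverting Eq. (3.20), and compares the resulting fractions in six equal bins of $\theta$ with the area under $\tfrac{1}{2}\sin\theta$. The sampling trick ($b = R\sqrt{u}$ with $u$ uniform, which follows from the cumulative distribution $(b/R)^2$), the seed, the sample size, and the bin count are assumptions made for the example.

```python
import math
import random

def scattering_angle(b, R):
    """Invert Eq. (3.20), b = R cos(theta/2), for 0 <= b <= R."""
    return 2.0 * math.acos(b / R)

rng = random.Random(3)        # fixed seed so the run is repeatable
R = 1.0                       # sphere radius in arbitrary units
n = 200_000                   # number of incident particles that hit the sphere
bins = 6                      # equal bins in theta between 0 and pi
counts = [0] * bins
for _ in range(n):
    # p(b) = 2b/R^2 has cumulative distribution (b/R)^2, so b = R*sqrt(u)
    b = R * math.sqrt(rng.random())
    theta = scattering_angle(b, R)
    counts[min(int(theta / math.pi * bins), bins - 1)] += 1

observed = [c / n for c in counts]
# Area under (1/2) sin(theta) over each bin, from Eq. (3.22)
expected = [
    (math.cos(k * math.pi / bins) - math.cos((k + 1) * math.pi / bins)) / 2
    for k in range(bins)
]
for k in range(bins):
    print(f"bin {k}: observed {observed[k]:.4f}, expected {expected[k]:.4f}")
```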

For an illustration of the case of a multiple-valued relation between the
quantities in question, consider the following example. A simple harmonic
oscillator is vibrating with amplitude $a$ and angular frequency $\omega$,
so that its displacement $x$ at time $t$ is given by:

$$x = a \sin\omega t. \qquad (3.23)$$

The period $T$ is related to $\omega$ through $\omega T = 2\pi$.







Suppose we look at, or take snapshots
of, the oscillator "at random."
What is the distribution of displace- 
ments we will observe? By "randomness" 
here, we mean that any instant is as 
likely to be chosen as any other for 
the snapshot; that is, the distribution 
of observation times is uniform, in the 
same sense as we used the term in sec- 
tion 3.1.1. From the uniformity of dis- 
tribution of observation times, we are 
to deduce the corresponding distribu- 
tion of displacements, given Eq. (3.23) 
relating the two. Figure 3.14 shows the 
relation between t and x. It is many- 
valued, but we may confine our atten- 
tion to a single period of the motion, 
since all periods are identical. Within 
one period, say from $-T/2$ to $+T/2$,
there are two time intervals which cor- 
respond to any one given space inter- 
val, as shown in the figure. However, 
the symmetry of the curve is such that 
they will each contribute equally to 
the distribution of x, so that we can 
confine our attention to the half per- 
iod from -T/4 to +T/4, during which x 
increases monotonically from -a to +a. 

A uniform distribution of times in this 
interval corresponds to the density 
function: 

$$p(t)\,dt = (2/T)\,dt, \quad (-T/4 \le t \le +T/4). \qquad (3.24)$$

The relation between $t$ and $x$ is found by solving Eq. (3.23) for $t$:

$$t = (1/\omega)\,\arcsin(x/a). \qquad (3.25)$$

Hence we have, applying Eq. (3.14):

$$q(x) = \frac{2}{T}\,\frac{d}{dx}\left[\frac{1}{\omega} \arcsin\left(\frac{x}{a}\right)\right] = \frac{2}{\omega T}\,\frac{1}{\sqrt{a^2 - x^2}} = \frac{1}{\pi \sqrt{a^2 - x^2}}. \qquad (3.26)$$

This probability density function for $x$ is sketched in Fig. 3.15. We see
that it has a symmetrical minimum at $x = 0$, and that it rises
asymptotically to $\infty$ at $x = -a$ and $x = +a$. This is no cause for
alarm, if we remember that it is only areas under the curve that are
interpreted as probabilities. Any interval (even one extending to $a$ or
$-a$) does indeed have a finite area under it; in fact, the area under the
whole curve is 1, as it must be. The form of the curve shows that we are
much more likely to see large displacements than small ones, if we look at
random times; this is understandable when we remember that the speed of
the oscillator is large at the center and zero at the end points. It,
therefore, spends more time at the ends than at the middle, and our
probability density for the observed positions reflects this fact.

3.3 RADIOACTIVE DECAY; MOLECULAR
FREE PATHS

An important family of distributions which occurs in various contexts in
physics can be illustrated by the problem of radioactive decay. Suppose we
observe a substance containing radioactive nuclei, which decay into a
different species of nucleus with the emission of an alpha particle, say,
at the moment of decay. What is observed is this: The number of nuclei
which still survive at time $t$ is an exponential function of time:

$$N(t) = N_0\,e^{-\lambda t}. \qquad (3.27)$$


That is, the fraction surviving after any given time is the same for each
succeeding time interval of the same length, so that, if half remain after
10 days, one quarter will remain after 20 days, one eighth after 30 days,
and so on. Stated otherwise, the number which decay between times 0 and
$t$ is given by:

$$N_{\text{dec}} = N_0 - N(t) = N_0\,(1 - e^{-\lambda t}). \qquad (3.28)$$

We interpret this in terms of the in- 
dividual nuclei by saying that the 
probability of decay between time 0 
and time t for each nucleus is: 

$$P(t) = 1 - e^{-\lambda t}. \qquad (3.29)$$



This is then a cumulative probability distribution for the time of decay;
the corresponding probability density function $p(t)\,dt$, which gives the
probability that the decay occurs between times $t$ and $(t + dt)$, is
given as usual by Eq. (3.2):

$$p(t)\,dt = P(t + dt) - P(t) = P'(t)\,dt = \lambda\,e^{-\lambda t}\,dt. \qquad (3.30)$$

Figs. 3.16, 3.17, and 3.18, respectively, give sketches of $p(t)$, $P(t)$,
and $[1 - P(t)]$. Notice that we can interpret Equation (3.30) in the
following way: It is the product of two factors; the first,
$e^{-\lambda t}$, is, by Eq. (3.29), equal to $[1 - P(t)]$, the
probability of survival to time $t$; the second, $\lambda\,dt$, must
therefore represent the probability of decay in the interval $dt$. Now
Eq. (3.29) gives, for sufficiently small time intervals $\Delta t$ (small
enough so that $\lambda \Delta t \ll 1$), by expanding the exponential
function:

$$P(\Delta t) = 1 - e^{-\lambda \Delta t} = 1 - [1 - \lambda\,\Delta t + \ldots] \approx \lambda\,\Delta t. \qquad (3.31)$$

The remarkable fact of nature to observe here is that $\lambda$ is
independent of time, that is, the probability of decay in a small time
interval $\Delta t$ is always $\lambda\,\Delta t$, whether we are
observing the nucleus immediately after its formation, or after a long
time has already passed. The nucleus has, so to say, no memory of its own
"age" built in. Contrast this behavior with the distribution of ages in a
biological population, like human beings. Here the probability of death in
the next year, say, increases steadily from birth on (except possibly in
the very early part of life). That is, $\lambda$ is not constant, but
increases with time. A simple model to describe this behavior may be made
by assuming $\lambda$ to be proportional to the time:

$$\lambda(t) = bt. \qquad (3.32)$$

Then we have the probability of death between times $t$ and $(t + dt)$
given by the product of the probability of survival up to time $t$, that
is $[1 - P(t)]$, with the probability of death in the interval $dt$:

$$p(t)\,dt = [1 - P(t)]\,\lambda(t)\,dt,$$

or, using Eq. (3.32) and the fact that $p(t) = P'(t)$:

$$P'(t) = bt\,[1 - P(t)],$$

$$\frac{d}{dt}[1 - P(t)] = -bt\,[1 - P(t)].$$


This can be integrated, bearing in mind that $P(0) = 0$, to give:

$$1 - P(t) = \exp(-bt^2/2), \qquad (3.33)$$

$$P(t) = 1 - \exp(-bt^2/2), \qquad (3.34)$$

$$p(t) = P'(t) = bt\,\exp(-bt^2/2). \qquad (3.35)$$



These curves are sketched in Figs. 3.19, 3.20, and 3.21. Notice the
similarities and differences between these and the corresponding curves
for the case of radioactive decay in Figs. 3.16, 3.17, and 3.18. The
difference between the two cases may be put in the following way: for the
radioactive decay case, the probability of survival to time $(t_1 + t_2)$
is given by $\exp[-\lambda(t_1 + t_2)]$; this may be expressed as the
product of the probability of survival to time $t_1$,
$\exp(-\lambda t_1)$, by $\exp(-\lambda t_2)$, which is the probability of
survival a further time $t_2$, but which is independent of $t_1$! It is
quite otherwise for the survival law given by Eq. (3.33); here we have
instead:

$$\exp[-b(t_1 + t_2)^2/2] = \exp(-b t_1^2/2) \times \exp[-b(t_1 t_2 + t_2^2/2)],$$

and the second factor, which gives the probability of survival a further
time $t_2$, clearly depends on $t_1$. This will be the case for any
survival law except the radioactive decay law.
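
The contrast between the two survival laws is easy to tabulate. The sketch below computes the probability of surviving a further time $t_2$, given survival to $t_1$, as the ratio of survival probabilities at $(t_1 + t_2)$ and at $t_1$, first for the decay law of Eq. (3.29) and then for the "aging" law of Eq. (3.33). The function names and the numerical values of $\lambda$, $b$, $t_1$, and $t_2$ are assumptions chosen only for illustration.

```python
import math

def survival_decay(t, lam):
    """Probability of survival to time t for radioactive decay: 1 - P(t) from Eq. (3.29)."""
    return math.exp(-lam * t)

def survival_aging(t, b):
    """Probability of survival to time t when lambda(t) = b*t, Eq. (3.33)."""
    return math.exp(-b * t * t / 2)

def further_survival(survival, t1, t2, param):
    """Probability of surviving a further time t2, given survival up to t1."""
    return survival(t1 + t2, param) / survival(t1, param)

lam, b, t2 = 0.1, 0.02, 5.0           # illustrative values (assumptions)
for t1 in (0.0, 10.0, 20.0):
    decay = further_survival(survival_decay, t1, t2, lam)
    aging = further_survival(survival_aging, t1, t2, b)
    print(f"t1 = {t1:5.1f}: decay law {decay:.4f}, aging law {aging:.4f}")
```

The decay-law column is the same for every $t_1$, while the aging-law column falls as $t_1$ grows.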

Let us go on to calculate the mean life of the decaying nuclei. This will
be given by our standard equation for mean values:

$$\langle t \rangle = \int_0^{\infty} t\,p(t)\,dt = \int_0^{\infty} t\,\lambda\,e^{-\lambda t}\,dt = (1/\lambda) \int_0^{\infty} x\,e^{-x}\,dx.$$

Here we have made the change of integration variable $x = \lambda t$. The
value of the definite integral occurring above is just 1, so we have:

$$\langle t \rangle = 1/\lambda. \qquad (3.36)$$

Thus the parameter $\lambda$ characterizing the decay is itself simply the
reciprocal of the mean life, and we may re-express the decay law in the
form:

$$p(t)\,dt = \exp(-t/\langle t \rangle)\,dt/\langle t \rangle, \qquad (3.37)$$

$$P(t) = 1 - \exp(-t/\langle t \rangle). \qquad (3.38)$$

The mean life is therefore the time in which the number of surviving
nuclei falls to $1/e$ of its original value.
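
A small simulation makes both statements concrete. The sketch below generates decay times by setting the cumulative probability of Eq. (3.29) equal to a uniform random number and solving for $t$, then estimates the mean life and the fraction of nuclei surviving beyond it. The value of $\lambda$, the seed, and the number of nuclei are assumptions made for the example.

```python
import math
import random

rng = random.Random(4)                # fixed seed so the run is repeatable
lam = 0.5                             # decay constant in arbitrary units (an assumption)
n = 200_000                           # number of simulated nuclei

# Solve P(t) = 1 - exp(-lam * t) = u for t, with u uniform on [0, 1)
times = [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

mean_life = sum(times) / n
surviving_fraction = sum(1 for t in times if t > 1.0 / lam) / n

print("mean life:", mean_life, " expected 1/lambda =", 1.0 / lam)
print("fraction surviving past the mean life:", surviving_fraction,
      " expected 1/e =", math.exp(-1.0))
```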

Essentially the same types of dis- 
tribution are encountered in a quite 
different physical context, in which
we have "distance of travel" in place
of the "time" of the decay law. Of
course, a translation from time to
distance could be imagined easily for the
radioactive decay problem as follows: 
Imagine all the nuclei to have the 
same initial speed; then the distance 
they travel before decaying will be 
proportional to the time elapsed be- 
fore decay, and if we can observe the 
path lengths traveled until decay, they 
will have a distribution which will be 
of precisely the same form as our dis- 
tribution of decay times. (Something 
very much like this is actually done 
in fundamental particle physics, where 
the tracks of unstable short-lived 
particles are observed in bubble cham- 
bers or other devices, and mean lives 
are deduced from them.) But now we may 
imagine, instead of "decay," that the 
particle undergoes "interactions," or 
more simply, collisions which change 
its speed and direction of motion. This 
occurs continually, for example, in a 
gas; the molecules move in essentially 
straight lines (so-called "free 
paths"), interrupted from time to time 
by collisions which change the direc- 
tion of motion. Imagine following a 
molecule in its motion; there will be 
a sequence of free paths of varying 
lengths, which therefore have a cer- 
tain distribution, characterized by a 
certain mean free path. This distribu- 
tion is of considerable importance in 
the kinetic theory of gases, particu- 
larly in connection with processes like 
heat conduction and diffusion for which 
the collisions play an important role. 

We may analyze the problem of the distribution of free paths in much the
same way as the radioactive decay problem. The probability that a
collision will occur in a small interval $dx$ is, we assume, simply
proportional to the interval, and does not depend on how far the molecule
has traveled since its last collision. The molecule has no "memory" of its
past. Thus we will write $(dx/L)$ for this probability, where $L$ is a
constant of the dimensions of length. Then if $p(x)\,dx$ is the
probability that a collision occurs between $x$ and $(x + dx)$, and if
$P(x)$ is the cumulative probability that a collision occurs between 0 and
$x$, so that $[1 - P(x)]$ is the probability that no collision occurs
between 0 and $x$, we must have

$$p(x)\,dx = [1 - P(x)]\,(dx/L). \qquad (3.39)$$

But

$$p(x)\,dx = P(x + dx) - P(x) = P'(x)\,dx,$$

hence

$$P'(x) = [1 - P(x)]/L,$$

or

$$\frac{d[1 - P(x)]}{dx} = -(1/L)\,[1 - P(x)]. \qquad (3.40)$$

The solution of this differential equation for which $P(0) = 0$ is:

$$1 - P(x) = \exp(-x/L), \qquad (3.41)$$

or

$$P(x) = 1 - \exp(-x/L), \qquad (3.42)$$

and

$$p(x)\,dx = \exp(-x/L)\,dx/L. \qquad (3.43)$$

Thus the distribution of path lengths 
is exponential, exactly like the dis- 
tribution of radioactive decay times. 
Figures 3.16, 3.17, and 3.18 serve 
equally well to illustrate either one. 

The mean free path is given in the usual way by:

$$\langle x \rangle = \int_0^{\infty} x\,p(x)\,dx = \int_0^{\infty} x\,\exp(-x/L)\,dx/L = L. \qquad (3.44)$$

Thus the parameter $L$ is simply the mean free path. Its actual value for
gases at standard conditions is of the order of $10^{-5}$ cm.
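
The "no memory" assumption behind Eq. (3.39) can be simulated step by step: the molecule advances in small steps and at each step collides with probability $dx/L$, regardless of how far it has already gone. The sketch below does this and checks that the mean path comes out close to $L$ and that the fraction of paths longer than $L$ is close to $1/e$, as Eq. (3.41) predicts. The step size ($L/50$), the seed, and the number of paths are assumptions chosen for illustration; with a finite step the agreement is only approximate.

```python
import math
import random

def free_path_steps(rng, p_step):
    """Number of small steps taken up to and including the one in which a collision occurs."""
    k = 1
    while rng.random() >= p_step:     # no collision in this step, keep going
        k += 1
    return k

rng = random.Random(5)                # fixed seed so the run is repeatable
steps_per_L = 50                      # step length dx = L / 50 (an assumption)
p_step = 1.0 / steps_per_L            # collision probability dx / L in each step
n = 20_000                            # number of simulated free paths

paths = [free_path_steps(rng, p_step) for _ in range(n)]

mean_path_in_L = sum(paths) / n / steps_per_L
frac_longer_than_L = sum(1 for k in paths if k > steps_per_L) / n

print("mean free path in units of L:", mean_path_in_L)
print("fraction of paths longer than L:", frac_longer_than_L,
      " compare 1/e =", math.exp(-1.0))
```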



3.4 OTHER MOLECULAR PATH PROBLEMS;
POISSON DISTRIBUTION

We may go on to consider further 
questions associated with molecular 
paths such as the following: In a given 
length of the molecule's path, what is 
the probability of its undergoing
precisely 0, 1, 2, ..., n, ... collisions?
What is the probability that the nth
collision (from a given starting point)
takes place between x and (x + dx)?
Apart from the intrinsic interest of
these questions, their discussion will 
lead us to a connection with work of 
the previous chapter, namely to an im- 
portant limiting case of the binomial 
distribution. 

Consider the first question posed above. Given a length of a molecule's
path, what are the respective probabilities that during that length of
path the molecule has undergone 0 or 1 or 2 ... collisions? If we denote
these probabilities by $Q_0(x)$, $Q_1(x)$, $Q_2(x)$, ... then we already
know $Q_0(x)$; since it is the probability of no collision occurring
between 0 and $x$, it is identical with $[1 - P(x)]$ of the previous
section, and is given by $\exp(-x/L)$. What about $Q_1(x)$? There are many
ways in which one collision can occur in a path length $x$; it can occur
at any point between 0 and $x$. The total probability is given by the sum
of all the probabilities of the alternative ways in which it can happen.
If we denote by $dx_1$ the small interval in which the collision occurred,
and let $x_1$ be its distance from the origin of the path, we can see with
the help of Fig. 3.22 that the probability that one collision occurs in
$dx_1$, and that no collision occurs elsewhere, is given by the product:
(probability of no collision between 0 and $x_1$) times (probability of a
collision in $dx_1$) times (probability of no collision between $x_1$ and
$x$), that is, by the expression:

$$\exp\left(-\frac{x_1}{L}\right) \frac{dx_1}{L}\,\exp\left[-\frac{(x - x_1)}{L}\right].$$
This expression must be summed (that is, integrated) over all possible
values of $x_1$, i.e., from 0 to $x$, to get $Q_1(x)$, the probability of
precisely one collision occurring between 0 and $x$. Hence we have

$$Q_1(x) = \int_0^x \exp\left[-\frac{x_1}{L} - \frac{(x - x_1)}{L}\right] \frac{dx_1}{L} = e^{-x/L} \int_0^x \frac{dx_1}{L} = \left(\frac{x}{L}\right) e^{-x/L}. \qquad (3.45)$$

This function rises from zero at the
origin to a maximum value at $x = L$,
and then falls again as x increases 
further. This is to be expected: path 
lengths of the order of one mean free 
path are more likely to contain pre- 
cisely one collision than either longer 
or shorter paths. 

The subsequent functions $Q_2(x)$, $Q_3(x)$ and so on, can be found
similarly. With the help of the sketch in Fig. 3.23, we can write the
probability that collisions occur in $dx_1$ at $x_1$, and in $dx_2$ at
$x_2$, and nowhere else as:

$$\exp\left(-\frac{x_1}{L}\right) \frac{dx_1}{L}\,\exp\left[-\frac{(x_2 - x_1)}{L}\right] \frac{dx_2}{L}\,\exp\left[-\frac{(x - x_2)}{L}\right].$$

The total probability is given by integrating over all values of $x_1$ and
$x_2$ from 0 to $x$ with the restriction that $x_1 < x_2$; or
alternatively, we can integrate them without restriction, and divide the
result by two to account for the spurious doubling which arises in this
method from the interchange of $x_1$ and $x_2$. In either case, we obtain:

$$Q_2(x) = \frac{1}{2} \int_0^x \frac{dx_1}{L} \int_0^x \frac{dx_2}{L}\,\exp\left[-\frac{x_1}{L} - \frac{(x_2 - x_1)}{L} - \frac{(x - x_2)}{L}\right] = \frac{1}{2}\,e^{-x/L} \int_0^x \frac{dx_1}{L} \int_0^x \frac{dx_2}{L} = \frac{1}{2} \left(\frac{x}{L}\right)^2 e^{-x/L}. \qquad (3.46)$$



Carrying on similarly, we can find the
general result for the probability of
precisely n collisions occurring in
the distance x:

$$ Q_n(x) = \frac{1}{n!} \left(\frac{x}{L}\right)^n e^{-x/L}. \tag{3.47} $$



Notice that the total probability of 
all possible numbers of collisions in 
a given distance is unity, as it should 
be, independent of the distance: 



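$$ \sum_{n=0}^{\infty} Q_n(x) = \sum_{n=0}^{\infty} \frac{1}{n!} \left(\frac{x}{L}\right)^n e^{-x/L} = e^{+x/L} \cdot e^{-x/L} = 1. \tag{3.48} $$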



Hence the $Q_n(x)$ constitute a discrete
distribution for each x; we may use
them, for example, to calculate the
mean number of collisions to be
expected in a distance x:

$$ \langle n \rangle = \sum_{n=0}^{\infty} n\,Q_n(x) = \frac{x}{L}. \tag{3.49} $$


Reasonably enough, the mean number of 
collisions in a given distance is 
found by dividing the distance by the 
mean free path. Figure 3.24 shows 
sketches of the $Q_n$ plotted as functions
of x; the predominance of small numbers 
of collisions for short paths, and the 
gradually increasing importance of 
larger numbers as the paths become 
longer, is evident from the curves. 
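
The normalization and the mean number of collisions can be verified numerically by summing the series directly. The following is a minimal sketch under the assumption that truncating the sum at n = 100 is adequate for the moderate values of x/L used here; the function names are illustrative only.

```python
import math


def q_n(n, x, L):
    """Eq. (3.47): probability of precisely n collisions in a distance x."""
    return (x / L) ** n * math.exp(-x / L) / math.factorial(n)


def total_and_mean(x, L, n_max=100):
    """Return (sum of Q_n, sum of n * Q_n) over n = 0 .. n_max."""
    probs = [q_n(n, x, L) for n in range(n_max + 1)]
    total = sum(probs)
    mean = sum(n * p for n, p in enumerate(probs))
    return total, mean


if __name__ == "__main__":
    print(total_and_mean(2.5, 1.0))
```

For x = 2.5 L the first number should be very close to unity and the second very close to 2.5, in agreement with the statements that the total probability is unity and that the mean number of collisions is the distance divided by the mean free path.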

It is important to recognize,
however, that $Q_n(x)$ is not, as it
stands, a continuous distribution with
respect to x, of the type we have been
studying heretofore in this chapter; it
is neither a probability density, nor
a cumulative distribution. We could
ask: What is the probability that the
nth collision takes place between x
and (x + dx)? Or: What is the
probability that the nth collision
occurs between 0 and x? These functions
would be continuous distributions in x:
a density and a cumulative distribution,
respectively.

Let us denote by $p_n(x)\,dx$ the
probability that the nth collision
takes place between x and (x + dx); it
must be given by the product of the
probability that precisely (n - 1)
collisions take place in the distance x,
by the probability that one collision
takes place in dx. But the former is
$Q_{n-1}(x)$ and the latter is (dx/L);
hence

$$ p_n(x)\,dx = Q_{n-1}(x) \left(\frac{dx}{L}\right) = \frac{1}{(n-1)!} \left(\frac{x}{L}\right)^{n-1} e^{-x/L} \left(\frac{dx}{L}\right). \tag{3.50} $$



We see that the first one of the
family, $p_1(x)$, is identical with the
p(x) of the previous section, Eq. (3.43),
as it should be, and that all the
$p_n$'s satisfy the required
normalization condition:

$$ \int_0^{\infty} p_n(x)\,dx = \frac{1}{(n-1)!} \int_0^{\infty} \left(\frac{x}{L}\right)^{n-1} e^{-x/L} \frac{dx}{L} = \frac{1}{(n-1)!} \cdot (n-1)! = 1. \tag{3.51} $$


Hence we may calculate averages in the 
usual ways; for example, the mean dis- 
tance to the nth collision is given 
by: 



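$$ \langle x \rangle = \int_0^{\infty} x\,p_n(x)\,dx = \frac{1}{(n-1)!} \int_0^{\infty} x \left(\frac{x}{L}\right)^{n-1} e^{-x/L} \left(\frac{dx}{L}\right) = L\,\frac{n!}{(n-1)!} = nL. \tag{3.52} $$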



This result is of course closely re- 
lated to the result in Eq. (3.49), but 
they are by no means identical. Imagine 
regarding the two formulas as prescrip- 
tions for an experimental determination 
of the mean free path L; then the first 
says: "Take many sections of path, each 
of length x; find the average number of 
collisions in each, and divide it into 
x to obtain L." The second says: "Pick
a number n of collisions; then measure
many times the path required to give
that number. Divide n into the average
path length to obtain L."
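
The two prescriptions can be tried out on simulated data. The sketch below is illustrative only: it assumes exponentially distributed free paths with a known "true" mean free path, and the function names and sample sizes are arbitrary choices made for the example.

```python
import random


def estimate_L_fixed_length(x, L_true, sections, rng):
    """First prescription: count collisions in many sections of fixed length x."""
    total_collisions = 0
    for _ in range(sections):
        position = rng.expovariate(1.0 / L_true)
        while position < x:
            total_collisions += 1
            position += rng.expovariate(1.0 / L_true)
    mean_n = total_collisions / sections
    return x / mean_n  # divide the average number of collisions into x


def estimate_L_fixed_count(n, L_true, repeats, rng):
    """Second prescription: measure the path required to give n collisions."""
    total_path = 0.0
    for _ in range(repeats):
        total_path += sum(rng.expovariate(1.0 / L_true) for _ in range(n))
    mean_path = total_path / repeats
    return mean_path / n  # divide n into the average path length


if __name__ == "__main__":
    rng = random.Random(7)
    print(estimate_L_fixed_length(5.0, 2.0, 20_000, rng))
    print(estimate_L_fixed_count(5, 2.0, 20_000, rng))
```

Both estimates should come out close to the assumed value L = 2, although they are obtained from different kinds of measurement.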

Similarly, the two distributions
$p_n(x)$ and $Q_n(x)$ are closely related,
but their difference must be clearly
understood. $Q_n(x)$ is a discrete
distribution with n as the index
enumerating the various alternatives; x
enters as a parameter, fixed in advance,
though capable of taking on a continuous
range of values from 0 to $\infty$. It
answers the question "Given x, what is
the distribution of n?" But $p_n(x)$ is
a continuous distribution in x - a
probability density function - in which
n enters as a parameter, fixed in
advance, which may have any integer
value. It answers the question "Given
n, what is the distribution of x?"

The discrete distribution $Q_n(x)$ is
known to statisticians as the Poisson
distribution. It is related to the
asymmetric binomial distribution of
section 2.4 in a way which can be
understood by using a somewhat different
approach to the collision problem.
Imagine the length x of path divided
into a large number M of segments; M
is to be taken large enough so that
(x/M), the length of each segment, is
small compared to the mean free path L.
Then for each segment two alternatives
are possible: either there was a
collision in that segment, or there was
not; the former alternative has the
probability (length of segment)/(mean
free path), i.e., p = x/ML; the latter
has probability q = 1 - p = 1 - (x/ML).
Since there are M segments, the
distribution of n, the number of
occurrences of collisions, is given by
the asymmetric binomial distribution,
Eq. (2.27):

$$ p_n = \frac{M!}{n!\,(M - n)!}\, p^n q^{M-n}. \tag{3.53} $$

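Now p is very small; furthermore, we
are interested in the limit in which
$M \to \infty$. With this in mind, we
rearrange the terms of Eq. (3.53) as
follows:

$$ p_n = \frac{M(M - 1)(M - 2) \cdots (M - n + 1)}{n!} \left(\frac{x}{ML}\right)^n \left[1 - \left(\frac{x}{ML}\right)\right]^{M-n} = \frac{1}{n!} \left(\frac{x}{L}\right)^n \left[1 - \left(\frac{x}{ML}\right)\right]^M \left\{ \frac{(1 - 1/M)(1 - 2/M) \cdots [1 - (n - 1)/M]}{[1 - (x/ML)]^n} \right\}. \tag{3.54} $$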



Now for fixed n, the term in the curly
brackets approaches unity as $M \to \infty$;
and in the same limit, the factor
$[1 - (x/ML)]^M$ approaches $e^{-x/L}$,
according to one of the definitions of
the exponential function. We have
therefore in the limit:

$$ p_n \to \frac{1}{n!} \left(\frac{x}{L}\right)^n e^{-x/L}, \tag{3.55} $$

and we see that this is identical with
the $Q_n(x)$ deduced in a different way
previously. Thus the Poisson distribu- 
tion is a limiting case of the binomial 
distribution in which one of the alter- 
natives has an extremely small proba- 
bility. As such it has many applica- 
tions other than the one we have been 
discussing here. 
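
The approach of the binomial distribution to the Poisson distribution as the number of segments M grows can be seen numerically. The sketch below is an illustration with hypothetical function names; it compares the two distributions for n = 0 to 10 at a fixed value of x/L and reports the largest discrepancy.

```python
import math


def binomial_pmf(n, M, p):
    """Asymmetric binomial probability of n collisions among M segments."""
    return math.comb(M, n) * p**n * (1 - p) ** (M - n)


def poisson_pmf(n, mean):
    """Poisson probability of n collisions when the mean number is x/L."""
    return mean**n * math.exp(-mean) / math.factorial(n)


def max_difference(x_over_L, M, n_max=10):
    """Largest |binomial - Poisson| over n = 0 .. n_max, with p = x/(M L)."""
    p = x_over_L / M
    return max(
        abs(binomial_pmf(n, M, p) - poisson_pmf(n, x_over_L))
        for n in range(n_max + 1)
    )


if __name__ == "__main__":
    for M in (10, 1000, 100_000):
        print(M, max_difference(2.0, M))
```

The discrepancy should shrink steadily as M is increased, which is the content of the limiting argument above.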



3.5 GAUSSIAN DISTRIBUTION; ERRORS 



There exists a different
approximation to the binomial
distribution from the one considered in
the previous section, which is of great
importance. It is the limit in which M
is large, and the probabilities in the
neighborhood of the maximum of the
distribution, which occurs at M/2 in
the symmetric case, are of primary
interest. Furthermore, we are often
interested in regarding the index k of
the distribution as continuous rather
than as discrete; this can be made
meaningful when M is sufficiently large,
when changes of several units in k
correspond to only very slight changes
in the associated probability $p_k$.
Let us see how to handle this
approximation. We will use the
symmetric case for simplicity; the
general asymmetric case can be done
similarly.

We begin by considering the ratio
of the probability for arbitrary index
k to the probability at the maximum,
which occurs at k = M/2. According to
Eq. (2.20), this is given by:

$$ \frac{p_k}{p_{M/2}} = \frac{[(M/2)!]^2}{k!\,(M - k)!}. \tag{3.56} $$



It will be useful to introduce an
index m which is measured from the peak
of the curve:

$$ m = k - (M/2). \tag{3.57} $$



Then we have

$$ \frac{p_k}{p_{M/2}} = \frac{(M/2)!\,(M/2)!}{[(M/2) + m]!\,[(M/2) - m]!} = \frac{(\tfrac{1}{2}M)(\tfrac{1}{2}M - 1)(\tfrac{1}{2}M - 2) \cdots (\tfrac{1}{2}M - m + 1)}{(\tfrac{1}{2}M + m)(\tfrac{1}{2}M + m - 1)(\tfrac{1}{2}M + m - 2) \cdots (\tfrac{1}{2}M + 1)} = \frac{1\,[1 - (2/M)][1 - (4/M)] \cdots [1 - 2(m - 1)/M]}{[1 + (2/M)][1 + (4/M)] \cdots [1 + (2m/M)]}. \tag{3.58} $$



We have written it this way to show 
that if m is much less than M - which 
is the case in the central part of the 
distribution that we want to approxi- 
mate to - then each factor in both the 
numerator and the denominator is close 
to unity. However, there are many fac- 
tors, for even if m is much less than 
M, it can be itself large compared with 
unity. It will be helpful, since we 
have a product of many factors, to 
take the logarithm, since the loga- 
rithm of the product of several fac- 
tors is equal to the sum of their log- 
arithms. Doing this, we obtain 



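$$ \ln\left(\frac{p_k}{p_{M/2}}\right) = \sum_{j=1}^{m-1} \ln\left[1 - \left(\frac{2j}{M}\right)\right] - \sum_{j=1}^{m} \ln\left[1 + \left(\frac{2j}{M}\right)\right] = \sum_{j=1}^{m-1} \left[-\frac{2j}{M} - \frac{1}{2}\left(\frac{2j}{M}\right)^2 - \cdots\right] - \sum_{j=1}^{m} \left[\frac{2j}{M} - \frac{1}{2}\left(\frac{2j}{M}\right)^2 + \cdots\right]. $$
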

Here we have used the Taylor expansion
of the function ln (1 + x) for small x.
Now using the summation formulas

$$ \sum_{j=1}^{m-1} j = m(m - 1)/2 $$

and

$$ \sum_{j=1}^{m} j = m(m + 1)/2, $$

and neglecting all but the first term
in each sum because the higher terms
give rise to expressions proportional
to $m^3/M^2$, which is much less than
$m^2/M$, we obtain

$$ \ln\left(\frac{p_k}{p_{M/2}}\right) \approx -\frac{(2/M)\,m(m - 1)}{2} - \frac{(2/M)\,m(m + 1)}{2} = -2m^2/M, $$

or:

$$ p_k/p_{M/2} \approx \exp(-2m^2/M) $$

and

$$ p_k \approx p_{M/2} \exp(-2m^2/M). \tag{3.59} $$



If we reexpress this in terms of the 
original index k, we have: 

$$ p_k \approx p_{M/2} \exp\{-2[k - (M/2)]^2/M\}. \tag{3.60} $$

This is evidently of the Gaussian
form defined in Eq. (3.12), except
that it is a discrete distribution in
an index k taking on integer values
rather than a probability density in a
continuous variable. We may make the
transition to that form by observing
that if M is large enough, there exist
intervals $\Delta k$ which, although
they contain several values of k (that
is, they include several consecutive
integers), are still sufficiently small
compared to M that $p_k$ changes very
little in that interval; then the
probability that the index lies between
k and $(k + \Delta k)$ is simply
$p_k\,\Delta k$. If we now rewrite this
in the style of a density function, we
have:

$$ p(k)\,dk = p_{\max} \exp\{-2[k - (M/2)]^2/M\}\,dk, \tag{3.61} $$

which is now precisely of the form of
a Gaussian density function, with mean
M/2 and standard deviation $\sigma$ given
by $\sigma^2 = M/4$. Notice that these
values are identical with those found
in Chapter 2, Eqs. (2.22) and (2.24).
The constant $p_{\max}$ which occurs in
the distribution has of course a known
value; but it is simpler to observe
that the normalization condition
requires it to be:

$$ p_{\max} = \sqrt{\frac{2}{\pi M}}. \tag{3.62} $$

Thus the Gaussian distribution appears 
as a limiting form of the binomial 
distribution, appropriate for large M 
and in the neighborhood of the maxi- 
mum of the distribution. 
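
The quality of the Gaussian approximation near the maximum can be judged by direct comparison with the exact symmetric binomial probabilities. The sketch below is an illustration; the choice M = 100 and the function names are assumptions made for the example.

```python
import math


def binomial_symmetric(k, M):
    """Exact symmetric binomial probability M! / (k! (M - k)! 2^M)."""
    return math.comb(M, k) / 2**M


def gaussian_approx(k, M):
    """Gaussian form of Eqs. (3.61) and (3.62)."""
    return math.sqrt(2.0 / (math.pi * M)) * math.exp(-2.0 * (k - M / 2) ** 2 / M)


if __name__ == "__main__":
    M = 100
    for k in range(45, 56):
        exact = binomial_symmetric(k, M)
        approx = gaussian_approx(k, M)
        print(k, exact, approx, approx / exact - 1.0)
```

For M = 100 the relative difference in the neighborhood of k = M/2 should be well under one per cent; it grows as k moves away from the maximum, as the derivation leads us to expect.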

One of the areas of application
of the Gaussian (or "normal," as the
statisticians call it) distribution
is in the discussion of experimental
error. It is often said that the
measured values of any physical
quantity, if a sufficient number are
accumulated, form a Gaussian or normal
distribution about the "true" value.
Leaving aside any deeper discussion of
the question of what one means by the
"true" value, let us simply suppose
that there does, in fact, exist a
correct value $q_0$ (possibly determined
by using more refined apparatus) of the
quantity being measured, and let us
inquire what might be the cumulative
effect of various possible sources of
error in the measurement. A very crude
model of the process is the following:
Suppose there are M sources of error in
the experiment, and each gives rise to
an error $+\epsilon$ or $-\epsilon$ with
equal probability. The actual result of
an individual measurement of the
quantity q will then be:

$$ q = q_0 + (k - j)\epsilon, \tag{3.63} $$

where k is the number of times
$+\epsilon$ occurs, and j is the number
of times that $-\epsilon$ occurs. Now
k + j = M; hence

$$ q = q_0 + (2k - M)\epsilon. \tag{3.64} $$

Now the repetition M times of the
choice $+\epsilon$ or $-\epsilon$ gives
rise to a binomial distribution for k,
the number of $+\epsilon$ choices; hence

$$ p_k = \frac{M!}{k!\,(M - k)!}\,\frac{1}{2^M} \tag{3.65} $$

is the probability that an individual
measurement will yield a value of q
corresponding (through Eq. (3.64)) to a
given k. Rewriting this in the
approximation derived earlier, we have

$$ p(k)\,dk = \sqrt{\frac{2}{\pi M}}\, \exp\{-2[k - (M/2)]^2/M\}\,dk. $$

Now let us use the results of section
3.2 to convert from a distribution in
k to a distribution in q:

$$ p(k)\,dk = \sqrt{\frac{2}{\pi M}}\, \exp\left\{-\frac{2}{M}\left[\frac{(q - q_0)}{2\epsilon}\right]^2\right\} \frac{dq}{2\epsilon}. $$

Hence the probability density function
for q is given by

$$ p(q) = \sqrt{\frac{1}{2\pi M\epsilon^2}}\, \exp\left[-\frac{(q - q_0)^2}{2M\epsilon^2}\right], \tag{3.66} $$

so that the distribution of q values
is indeed Gaussian, with mean $q_0$ and
standard deviation $\sqrt{M}\,\epsilon$.

This is, of course, a very crude 
and unrealistic model of the "error" 
problem; nevertheless, it does happen 
that even under considerably less re- 
strictive assumptions, one arrives at 
a Gaussian distribution of errors. 

This does not, of course, guarantee 
that in any particular experiment the 
distribution of values obtained neces- 
sarily follows the "normal" distribu- 
tion; but it is indeed observed to 
hold in sufficiently many situations 
to make it a very useful first assump- 
tion. 
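
The crude error model is easy to simulate. The sketch below is an illustration under assumed values (a "correct" value of 10, M = 25 sources, each contributing an error of magnitude 0.1); the function name is hypothetical. It generates many simulated measurements and lets us compare their spread with the standard deviation predicted by Eq. (3.66).

```python
import random
import statistics


def simulate_measurements(q0, eps, M, trials, seed=3):
    """Simulate measurements with M error sources, each +eps or -eps with equal odds."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        k = sum(1 for _ in range(M) if rng.random() < 0.5)  # number of +eps errors
        values.append(q0 + (2 * k - M) * eps)  # Eq. (3.64)
    return values


if __name__ == "__main__":
    values = simulate_measurements(10.0, 0.1, 25, 20_000)
    print(statistics.mean(values), statistics.pstdev(values))
```

With these assumed values the mean should be close to 10 and the standard deviation close to 0.5, that is, to the square root of M times the individual error.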



3.6 JOINT DISTRIBUTIONS; MAXWELLIAN 
VELOCITY DISTRIBUTION 

In the discrete case, we studied
the joint distribution arising in the
problem of assignment of molecules to
several cells into which a volume of
gas had been divided. The collection
of integers $k_1, k_2, k_3, \ldots, k_n$
giving the number of molecules occupying
each cell was found to occur with a
probability $p(k_1, k_2, \ldots, k_n)$
which depended jointly on all the k's;
this was called therefore a "joint
distribution." In the continuous case
the analogous situation occurs; we often
have to deal with joint distributions of
several variables. An example to which
we will return later is the distribution
of the velocity components of a
molecule in a gas. Each molecule has a
velocity vector specified by the three
components along the three axes of
reference, say $v_x$, $v_y$, and $v_z$.
We need to know the probability that
simultaneously the x component has a
value lying between $v_x$ and
$(v_x + dv_x)$, the y component has a
value between $v_y$ and $(v_y + dv_y)$,
and the z component has a value between
$v_z$ and $(v_z + dv_z)$. Or we may be
interested in its speed and direction,
and therefore inquire about the
probability of simultaneously finding
its speed between v and (v + dv) while
its direction lies within the solid
angle defined by the intervals from
$\theta$ to $(\theta + d\theta)$ and
$\phi$ to $(\phi + d\phi)$ of the
spherical polar angles defining its
direction with respect to a fixed polar
axis.

In general, we define the joint
probability density of two quantities
by the function of two variables
p(x,y) such that p(x,y) dx dy gives the
probability that the first lies between
x and (x + dx) and the second lies
between y and (y + dy). The
generalization to more variables is
made in the obvious way: p(x,y,z) dx dy
dz for three, and so on. The density
must satisfy a normalization condition:

$$ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(x,y)\,dx\,dy = 1, \tag{3.67} $$

which assures us that the two
quantities are certain to be found
somewhere in their range. The
probability that x and y lie within any
region R of the x-y plane is given by
the integral:

$$ \iint_R p(x,y)\,dx\,dy, $$

where the integration symbol $\iint_R$
means a two-dimensional integral over
the region R.

From such a joint distribution,
the distribution of either variable
separately may be found; for the
probability that the first one lies
between x and (x + dx), irrespective of
the value of y, is given by

$$ dx \int_{-\infty}^{\infty} p(x,y)\,dy. \tag{3.68} $$

This follows by taking R to be the
whole region corresponding to x lying
in the interval dx while y is anywhere;
this is simply a strip of width dx
running parallel to the y axis, and the
corresponding integration is just what
is expressed in Eq. (3.68). Similarly
the probability that the second
variable lies between y and (y + dy),
irrespective of the value of x, is
given by

$$ dy \int_{-\infty}^{\infty} p(x,y)\,dx. \tag{3.69} $$

Each of these is properly normalized 
by virtue of Eq. (3.67). 

Mean values of functions of x and
y are defined in a way which is the
appropriate generalization of the
single-variable Eq. (3.8) to the case of
two variables:

$$ \langle f(x,y) \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\,p(x,y)\,dx\,dy. \tag{3.70} $$

In addition to the means
$\langle x \rangle$, $\langle y \rangle$,
$\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2$,
and
$\sigma_y^2 = \langle y^2 \rangle - \langle y \rangle^2$,
which we have used frequently, a new
average called the "correlation
coefficient" and defined by the
equation:

$$ r = \frac{\langle xy \rangle - \langle x \rangle \langle y \rangle}{\sigma_x \sigma_y} \tag{3.71} $$

becomes important; it is an indicator
of the degree to which the two
variables are independent of one
another. They are independent if p(x,y)
is of the form of a product
$p_1(x)\,p_2(y)$; for then the
probability of the joint event "x in dx
and y in dy" is simply the product of
the probabilities of the events "x in
dx," irrespective of y, and "y in dy,"
irrespective of x. Since the
probability of joint events is the
product of the individual probabilities
only if the events are independent, the
product form $p_1(x)\,p_2(y)$ for the
joint probability indicates
independence of x and y. We see that r
will vanish in this case, for then we
will have
$\langle xy \rangle = \langle x \rangle \langle y \rangle$.
If, however, x and y are "correlated,"
then r need not vanish. If, for
example, there is a tendency for
positive x to occur with positive y,
and negative x with negative y, then
r will be positive.
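
The behavior of the correlation coefficient can be illustrated with sampled data. The sketch below is an added example, not taken from the text: it assumes x uniform between -1 and 1, and builds one y that is independent of x and another that is x plus an independent uniform term. For the second construction the covariance is 1/3 and the variances are 1/3 and 2/3, so that r should be close to 1/sqrt(2). The function names are hypothetical.

```python
import math
import random


def correlation(xs, ys):
    """Sample version of the correlation coefficient r of Eq. (3.71)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    var_x = sum(x * x for x in xs) / n - mean_x**2
    var_y = sum(y * y for y in ys) / n - mean_y**2
    return (mean_xy - mean_x * mean_y) / math.sqrt(var_x * var_y)


def sample_pairs(n, seed=11):
    """Return xs, an independent ys, and a ys that tends to follow xs."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    independent = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    correlated = [x + rng.uniform(-1.0, 1.0) for x in xs]
    return xs, independent, correlated


if __name__ == "__main__":
    xs, independent, correlated = sample_pairs(50_000)
    print(correlation(xs, independent), correlation(xs, correlated))
```

The first value should be near zero and the second clearly positive, in line with the discussion above.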

Let us give some simple examples 
of joint distributions. 



3.6.1 Uniform Distribution Over a 
Rectangular Area 

Suppose a point is to be picked
"at random" from the rectangle in the
x-y plane defined by $0 < x < L_1$,
$0 < y < L_2$; what joint probability
density function describes this? "At
random" means here that the point is
equally likely to be found in any two
areas of the same size within the
rectangle; hence p(x,y) is constant
over the rectangle, and zero elsewhere.
Bearing in mind the normalization
condition, Eq. (3.67), we may write:

$$ p(x,y)\,dx\,dy = \begin{cases} \dfrac{dx\,dy}{L_1 L_2}, & 0 < x < L_1,\ 0 < y < L_2 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.72} $$

In this case the variables are
independent, since p(x,y) can be
written as the product $p_1(x)\,p_2(y)$,
where $p_1$ and $p_2$ are each of the
form of the uniform one-dimensional
distribution given by Eq. (3.9). The
various averages can be found easily.



3.6.2 Uniform Angular Distribution

Directions in space may be specified
by the spherical polar coordinates
$\theta$ and $\phi$, which give the
position of the point at which a ray
from the origin intersects a sphere. By
a "uniform angular distribution" we
mean that the intersection point is
equally likely to be found in equal
areas on the sphere; therefore to
express the probability density
function in terms of $\theta$ and $\phi$
we need to find what fraction of the
total area of the sphere corresponds to
the area on the sphere defined by the
range $d\theta$ of $\theta$ and $d\phi$
of $\phi$. Figure 3.25 shows the
relevant area; we see that it has
dimensions $R\,d\theta$ by
$R \sin\theta\,d\phi$. The area of the
sphere is $4\pi R^2$; hence the ratio of
the two, which gives the probability
that the point lies in the range of
$\theta$ and $\phi$ specified, is given
by:

$$ p(\theta,\phi)\,d\theta\,d\phi = \frac{(R\,d\theta)(R \sin\theta\,d\phi)}{4\pi R^2} = \frac{\sin\theta\,d\theta\,d\phi}{4\pi}. \tag{3.73} $$

Notice that here also the variables
are independent. Furthermore, we may
work out the distribution of $\theta$
alone; according to Eq. (3.68) this is
given by:

$$ d\theta \int_0^{2\pi} p(\theta,\phi)\,d\phi = \frac{\sin\theta\,d\theta}{4\pi} \int_0^{2\pi} d\phi = \frac{2\pi \sin\theta\,d\theta}{4\pi} = \frac{\sin\theta\,d\theta}{2}, $$

which agrees with the result of
section 3.1.2, Eq. (3.11).
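
A uniform angular distribution can be sampled and checked numerically. The sketch below is an illustration using a technique not described in the text: since the polar angle has a density proportional to its sine, the cosine of that angle is uniformly distributed between -1 and 1, and the azimuthal angle is uniform between 0 and 2 pi. The function names are hypothetical. The fraction of directions with polar angle below a given value should then be (1 - cos of that value)/2.

```python
import math
import random


def random_direction(rng):
    """Sample (theta, phi) from the uniform angular distribution."""
    cos_theta = rng.uniform(-1.0, 1.0)  # uniform cosine gives density sin(theta)/2
    theta = math.acos(cos_theta)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return theta, phi


def fraction_below(theta_max, samples=100_000, seed=5):
    """Fraction of sampled directions whose polar angle is less than theta_max."""
    rng = random.Random(seed)
    count = sum(1 for _ in range(samples) if random_direction(rng)[0] < theta_max)
    return count / samples


if __name__ == "__main__":
    for theta_max in (math.pi / 3, math.pi / 2):
        print(theta_max, fraction_below(theta_max), (1.0 - math.cos(theta_max)) / 2.0)
```

For a polar angle of 60 degrees the expected fraction is one quarter, and for 90 degrees it is one half.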

Just as for distributions in one
variable, the question of change of
variables arises for joint
distributions. We saw in section 3.2
how to handle such a change; the method
is simply to substitute the change of
variable, both in the density function
and in the differential which
accompanies it. The coefficient of the
differential of the new variable is
then the density function in the new
variable.

For joint distributions in two or
more variables an analogous procedure
is appropriate, but one must remember
the method of transforming an element
of area or volume from one set of
variables to another. If p(x,y) dx dy
is the density function in x and y, and
if they are each functions of two new
variables u and v whose distribution
is required,

$$ x = f(u,v), \qquad y = g(u,v), $$

then in the calculus of two or more
variables we learn that the method of
transforming elements of area from the
x-y plane to the u-v plane is this:

$$ dx\,dy = \begin{vmatrix} \partial f/\partial u & \partial f/\partial v \\ \partial g/\partial u & \partial g/\partial v \end{vmatrix} du\,dv. \tag{3.74} $$



The determinant of the partial
derivatives which enters is called the
Jacobian determinant of the
transformation, and is often denoted by
J(f,g/u,v). Using this notation, we may
write the transformation of the density
function thus:

$$ p(x,y)\,dx\,dy = p[f(u,v), g(u,v)]\, J(f,g/u,v)\,du\,dv, \tag{3.75} $$

so that the new density function q(u,v)
is given by:

$$ q(u,v) = p[f(u,v), g(u,v)]\, J(f,g/u,v). \tag{3.76} $$



If, for example, we wish to
transform from (x,y) to polar
coordinates in a plane $(r,\theta)$, for
which $x = r\cos\theta$,
$y = r\sin\theta$, we find that
$dx\,dy = r\,dr\,d\theta$, and
$p(x,y)\,dx\,dy = p(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$,
so that

$$ q(r,\theta) = r\,p(r\cos\theta, r\sin\theta). $$



In a similar fashion the
transformation from a distribution in
three variables (x,y,z) to one in the
corresponding spherical polar
coordinates $(r,\theta,\phi)$, for which
the equations relating the coordinates
are

$$ x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta, $$

will result in a volume element
transformation

$$ dx\,dy\,dz = r^2 \sin\theta\,dr\,d\theta\,d\phi, $$

so that if the original density
function is p(x,y,z), the transformed
one is:

$$ q(r,\theta,\phi) = p(r\sin\theta\cos\phi,\ r\sin\theta\sin\phi,\ r\cos\theta)\,r^2 \sin\theta. $$



Notice that if, for example, the
density function in the polar
coordinates were required to be uniform
in angle, we would have to require that
p be of such a form that no angular
dependence arise from it, since the
$\sin\theta$ factor from the volume
element already gives the form of
angular dependence required by Eq.
(3.73) to describe a uniform angular
distribution. What this means is that p
can depend on (x,y,z) only through the
combination $(x^2 + y^2 + z^2)$, which
is of course just $r^2$ and is
independent of angle.
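
The transformation to plane polar coordinates can be checked numerically: the transformed density must still integrate to unity. The sketch below is an illustration under an assumed test case, namely a density p(x,y) that is a product of two Gaussians of unit variance; the function names, the cutoff radius, and the grid sizes are arbitrary choices made for the example.

```python
import math


def p_xy(x, y):
    """Assumed test density: independent Gaussians in x and y with unit variance."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)


def q_polar(r, theta):
    """Transformed density q(r, theta) = r * p(r cos(theta), r sin(theta))."""
    return r * p_xy(r * math.cos(theta), r * math.sin(theta))


def integrate_polar(r_max=10.0, n_r=2000, n_theta=64):
    """Midpoint-rule integral of q over 0 < r < r_max and 0 < theta < 2 pi."""
    dr = r_max / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += q_polar(r, theta) * dr * dtheta
    return total


if __name__ == "__main__":
    print(integrate_polar())
```

The result should be very close to unity. Omitting the factor r in q_polar would spoil the normalization, which shows why the element of area must be transformed along with the density.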

As a final example of a joint
distribution, we will discuss briefly
the distribution of molecular velocities
in a gas. This distribution, which is
very important in the kinetic theory
of gases, is generally known as the
"Maxwellian" distribution in honor of
James Clerk Maxwell, who was one of the
first physicists to explore the kinetic
theory mathematically. A number of
different derivations exist; we shall
give here one similar to one of
Maxwell's early discussions. It is not
to be regarded as a rigorous proof,
because a number of unproved
assumptions will be made in the course
of the argument. Nevertheless it is
interesting, and it illustrates the use
of arguments of a type much used in
modern physics - arguments of symmetry.

Let us then denote by
$p(v_x, v_y, v_z)$ the joint probability
density for finding the x component of
velocity of a molecule between $v_x$ and
$(v_x + dv_x)$, the y component between
$v_y$ and $(v_y + dv_y)$, and the z
component between $v_z$ and
$(v_z + dv_z)$. It will be useful to
imagine a "velocity space," that is, a
set of Cartesian coordinates
corresponding to each velocity
component; a given point in this space
corresponds to the velocity vector
extending from the origin to the point.
The point may also be described by
spherical polar coordinates
$(v,\theta,\phi)$ instead of
$(v_x, v_y, v_z)$; in this case v
represents the speed, or magnitude of
the velocity vector
($v^2 = v_x^2 + v_y^2 + v_z^2$), and
$\theta$ and $\phi$ are the polar angles
corresponding to the direction of the
velocity. Figure 3.26 shows the relation
between the various coordinates.

[Fig. 3.26]

What general arguments can we give
about the form of the distribution
function? First: $p(v_x, v_y, v_z)$
should be of such a form that all
directions of motion are equally
probable for the molecule; we do not
expect nature to have any preference
for one direction over another. This
means that the angular distribution
should be uniform, in the sense in
which we used the term earlier. This
means that p must depend on the
velocity components only through the
speed v, since as we saw before, the
transformed volume element in polar
coordinates already describes a uniform
angular distribution. We can express
this result by writing

$$ p(v_x, v_y, v_z) = G(v^2) = G(v_x^2 + v_y^2 + v_z^2), \tag{3.77} $$

where by G we mean a function of a
single variable. Second: We will
suppose that $v_x$, $v_y$, and $v_z$ are
independent, that is, that the
distribution of any one of them is the
same for fixed values of the other two,
regardless of the particular values
chosen for the other two. As we saw
earlier in this section, independence
implies that the joint distribution is
a product of individual distributions:

$$ p(v_x, v_y, v_z) = F_1(v_x)\,F_2(v_y)\,F_3(v_z). $$

This is the weakest assumption of this
method of arriving at the Maxwellian
distribution. It is by no means obvious
that if we examine those molecules with
$v_y = v_z = 0$, we will find precisely
the same distribution of values of
$v_x$ that we will find if we examine
those molecules with, say, large values
of $v_y$ and $v_z$. Nevertheless, the
randomizing effect of the molecular
collisions does indeed have the net
effect of making the distribution of
the components independent. It is not
easy to demonstrate this in a simple
way, so we shall simply assume it.
Third: the individual distributions
$F_1$, $F_2$, and $F_3$ must actually be
identical, for the labeling of the axes
is quite arbitrary; it clearly cannot
matter which direction we happen to
call x. The distribution of the x
component must be the same as that of
the y component. Thus:

$$ p(v_x, v_y, v_z) = G(v_x^2 + v_y^2 + v_z^2) = F(v_x)\,F(v_y)\,F(v_z). \tag{3.78} $$

The remarkable fact is that these
assumptions completely determine the
form of the distribution function.
Notice first that if we set
$v_y = v_z = 0$ in Eq. (3.78), we get a
relation between the functions F and G:

$$ G(v_x^2) = F(v_x)\,[F(0)]^2, \tag{3.79} $$

which, when substituted in Eq. (3.78)
for F, gives us a relation for the
single function G:

$$ G(v_x^2 + v_y^2 + v_z^2) = G(v_x^2)\,G(v_y^2)\,G(v_z^2)/[F(0)]^6. \tag{3.80} $$

It will simplify matters a bit if we
temporarily use (u,v,w) in place of
the squares of the velocity components.
If we also notice that from Eq. (3.79)
we have $G(0) = [F(0)]^3$, and if we let
g(u) = G(u)/G(0), then Eq. (3.80) can
be rewritten as:

$$ g(u + v + w) = g(u)\,g(v)\,g(w). \tag{3.81} $$

The only possible function satisfying
this relation is an exponential
function, as we see if we differentiate
both sides with respect to u, and then
set u = v = 0; this yields:

$$ g'(w) = g'(0)\,g(w). \tag{3.82} $$

The solution of this differential
equation for which g(0) = 1 is:

$$ g(w) = e^{-aw}, \qquad a = -g'(0). \tag{3.83} $$

Returning to the original notation and
inserting the result into Eq. (3.77),
we obtain the distribution function:

$$ p(v_x, v_y, v_z) = G(v^2) = G(0)\,g(v^2) = G(0) \exp(-av^2) = G(0) \exp[-a(v_x^2 + v_y^2 + v_z^2)]. \tag{3.84} $$

The constant G(0) is determined by the
normalization condition:

$$ 1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p\,dv_x\,dv_y\,dv_z = G(0) \int_{-\infty}^{\infty} \exp(-av_x^2)\,dv_x \int_{-\infty}^{\infty} \exp(-av_y^2)\,dv_y \int_{-\infty}^{\infty} \exp(-av_z^2)\,dv_z = G(0)\,(\pi/a)^{3/2}; $$

hence, the final form of the
distribution function is:

$$ p(v_x, v_y, v_z) = (a/\pi)^{3/2} \exp[-a(v_x^2 + v_y^2 + v_z^2)]. \tag{3.85} $$

This is the Maxwellian distribution;
we see that it is "Gaussian" in each
velocity component. The constant a
remains undetermined; evidently it has
to do with the width of the
distribution, that is, the mean square
velocity. In the kinetic theory one
shows that it must be inversely
proportional to the absolute
temperature of the gas.

We may write the distribution in
the polar form; using the fact that
the volume element becomes
$v^2 \sin\theta\,dv\,d\theta\,d\phi$ in
spherical polar coordinates, the
distribution becomes:

$$ q(v,\theta,\phi) = (a/\pi)^{3/2} \exp(-av^2)\,v^2 \sin\theta. \tag{3.86} $$

By integrating over the polar angles,
we can get the distribution of speed
alone:

$$ q(v) = \int_0^{2\pi} \int_0^{\pi} q(v,\theta,\phi)\,d\theta\,d\phi = (a/\pi)^{3/2} \exp(-av^2)\,v^2 \int_0^{2\pi} \int_0^{\pi} \sin\theta\,d\theta\,d\phi = \frac{4a^{3/2}}{\pi^{1/2}}\,v^2 \exp(-av^2). \tag{3.87} $$



From these results we can work out the
various averages of interest: mean
speed, mean square speed, mean square
of a velocity component, and so on.
The role of the Maxwellian distribution
of velocities is fundamental in many
fields of physics, and the
generalization to the Maxwell-Boltzmann
distribution which is made in
statistical mechanics underlies the
whole of thermal physics.
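
The averages mentioned above can be estimated by numerical integration of the speed distribution. The sketch below is an illustration with hypothetical function names; it assumes a simple midpoint rule and cuts the integral off at a speed beyond which the density is negligible. Because each velocity component is Gaussian with mean square 1/(2a), the mean square speed should come out as 3/(2a).

```python
import math


def speed_density(v, a):
    """Maxwellian distribution of speed, Eq. (3.87)."""
    return 4.0 * a**1.5 / math.sqrt(math.pi) * v * v * math.exp(-a * v * v)


def moment(power, a, steps=20_000):
    """Midpoint-rule estimate of the mean of v**power over the speed distribution."""
    v_max = 10.0 / math.sqrt(a)  # the density is negligible beyond this speed
    dv = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        total += v**power * speed_density(v, a) * dv
    return total


if __name__ == "__main__":
    a = 2.0
    print(moment(0, a))  # normalization
    print(moment(2, a), 3.0 / (2.0 * a))  # mean square speed
```

The zeroth moment checks the normalization of Eq. (3.87); other powers give the mean speed and higher averages in the same way.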

3.1 For the case of scattering of a
particle with charge $q_1$ at a fixed
scattering center with charge $q_2$,
by virtue of the electrostatic
interaction between them, the relation
between impact parameter b and
scattering angle $\theta$ is given by
classical dynamics to be
$b = (q_1 q_2/mv^2) \cot(\theta/2)$,
where m is the mass and v the speed of
the scattered particle. Find the
angular distribution of the scattered
particles. (This is the "Rutherford
scattering" from which Rutherford
deduced the existence of small,
massive, charged nuclei in atoms.)

3.2 Looking out the window, I see
objects falling past me. Measuring
their speeds, I find that they are
distributed uniformly between
40 ft/sec and 80 ft/sec.

(a) Assume they are being dropped
from rest at various heights above
me. What is the distribution of
heights required to give my observed
distribution of speeds?

(b) Assume instead that they are
all being dropped from 25 feet
above me, but with various initial
speeds. What distribution of initial
speeds will account for my
observed distribution of speeds?

3.3 For each of the distributions
discussed in section 3.4, find the
standard deviation; for $Q_n(x)$ this
will mean finding $\langle n^2 \rangle$,
and for $p_n(x)\,dx$, finding
$\langle x^2 \rangle$. In each case
compare the standard deviation with the
mean value itself, and use this result
to formulate instructions for the
experimenter interested in measuring
the mean free path to an accuracy of
about 5% in the way discussed
immediately following Eq. (3.52).

3.4 The counts produced by a Geiger
counter exposed to a steady source
which gives an average rate of one
count in time T, form a time sequence
governed by the same Poisson
distribution that governs the
molecular paths discussed in the
text; it is only necessary to
substitute t for x and T for L. What
is the probability that in a time
3T there occur (a) 2 or fewer
counts? (b) Exactly 3 counts?
(c) 4 or more counts?

3.5 The probability that the first
collision takes place between 0
and x is given by P(x), Eq. (3.42).
Call this $P_1(x)$; then find $P_2(x)$,
the probability that the second
collision takes place between 0
and x. (Hint: This is the cumulative
distribution associated with the
density $p_2(x)$.) At what distance
will one have 95% probability that the
first collision has occurred? The
second?

3.6 Suppose a needle of unit length is
dropped onto a plane marked with
parallel lines unit distance apart.
It falls "at random" on the plane.
Take this to mean that the joint
probability of finding the center
of the needle at a distance between x
and (x + dx) to the right of a line and
of finding the angle which the needle
makes with the lines to lie between
$\theta$ and $(\theta + d\theta)$, is
uniform over the range 0 < x < 1
and $-\pi/2 < \theta < +\pi/2$. Find the
probability that the needle intersects
a line, and the probability that it
does not intersect a line. (Hint:
You will need to find the area R
in the x-$\theta$ plane corresponding
to the required conditions.)

3.7 Suppose x is uniformly distributed
between 0 and 1, and y is also
uniformly distributed between 0
and 1, and they are independent of
one another. What is the joint
probability density for x and y?
Find the region R of the x-y plane
for which the average (x+y)/2 lies
between 0 and s. What is the
probability P(s) that the average lies
in this region? What is the
probability density function p(s) ds
of the average? Find $\sigma^2$ for the
original distribution of x (or y),
and find $\sigma^2$ for the
distribution of the average s.
Speculate on the probable behavior of
$\sigma^2$ as the number of identically
distributed independent variables
averaged increases.

3.8 (a) Show that the mean square of
each component of the velocity of
the molecules in a gas is one third
of the mean square speed. (Hint: It
is not necessary to perform any
integrals; use the relation between
$v^2$ and the squared components, the
definition of the mean, and a
symmetry argument.)

(b) From statistical mechanics we
know that the mean kinetic energy
$\tfrac{1}{2}m\langle v^2 \rangle$ of
the molecule in a gas must be equal to
3kT/2, where k is Boltzmann's constant
and T is the absolute temperature. Use
this to establish the relation between
the constant a appearing in the
Maxwellian distribution and the
absolute temperature T.